Compositions and methods for enhancing homologous recombination

ABSTRACT

The present disclosure generally relates to compositions and methods for improving the efficiency of homologous recombination. In particular, the disclosure relates to reagents and the use of such reagents.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application is a division of U.S. application Ser. No. 15/605,586 filed on May 25, 2017, now allowed, which claims the benefit of U.S. Provisional Application No. 62/342,504, filed May 27, 2016, the disclosure of each of which is incorporated by reference in its entirety.

FIELD

The present disclosure generally relates to compositions and methods for improving the efficiency of homologous recombination. In particular, the disclosure relates to reagents and the use of such reagents.

BACKGROUND

A number of genome-editing systems, such as designer zinc fingers, transcription activator-like effectors (TALEs), CRISPRs, and homing meganucleases, have been developed. One issue with these systems is that low levels of homologous recombination often require that numerous cells of clonal origin be screened to identify cells that have undergone homologous recombination and have the desired genotype. The generation and identification of cells with the correct genotype is often laborious and time consuming. In one aspect, the invention allows for the efficient design, preparation, and use of genome editing reagents and the generation and identification of cells that have been “correctly” edited.

SUMMARY

The present disclosure relates, in part, to compositions and methods for editing of nucleic acid molecules. There exists a substantial need for efficient systems and techniques for modifying genomes. This invention addresses this need and provides related advantages.

One aspect of the invention involves the choice of features such as molecular structures and incubation conditions that result in increased gene editing efficiency. In some instances, donor nucleic acid molecules used in the practice of the invention have termini that are nuclease resistant. This is believed to assist in stabilizing termini against nuclease action (e.g., against endogenous nucleases).

The invention includes methods for performing homologous recombination. In some aspects, these methods comprise (a) generating a double-stranded break in a nucleic acid molecule present inside a cell to produce a cleaved nucleic acid molecule, and (b) contacting the cleaved nucleic acid molecule generated in (a) with a donor nucleic acid molecule, wherein the cleaved nucleic acid molecule and the donor nucleic acid molecule each contain matched termini on at least one end, wherein the matched termini on at least one end of the cleaved nucleic acid molecule and the donor nucleic acid molecule are at least ten (e.g., from about 10 to about 200, from about 10 to about 150, from about 10 to about 100, from about 10 to about 90, from about 10 to about 75, from about 20 to about 140, from about 30 to about 100, etc.) nucleotides in length, and wherein the matched region of the cleaved nucleic acid molecule is single-stranded or double-stranded and the matched region of the donor nucleic acid molecule is single-stranded. In some instances, the matched termini on at least one end of the cleaved nucleic acid molecule and the donor nucleic acid molecule have 5′ overhangs or 3′ overhangs. In other instances, the matched termini on at least one end of the cleaved nucleic acid molecule and the donor nucleic acid molecule have one 5′ overhang and one 3′ overhang. In specific instances, a pair of matched termini is used where the terminus of the cleaved nucleic acid molecule is blunt and the terminus of the donor nucleic acid molecule has a 3′ overhang. Further, in some instances, at least one pair of matched termini of the cleaved nucleic acid molecule and the donor nucleic acid molecule share at least ten (e.g., from about ten to about fifty, from about ten to about forty, from about ten to about thirty, from about fifteen to about fifty, from about fifteen to about forty, from about fifteen to about thirty, etc.) complementary nucleotides. In some instances, the at least ten complementary nucleotides share at least 80%, at least 85%, at least 90%, at least 95%, or 100% sequence identity.
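The length and identity thresholds for matched termini described above can be illustrated with a short sketch. It is not part of the disclosed methods. The function names, the choice to write both overhangs 5′ to 3′, and the reading of “sequence identity” as identity between the donor overhang and the perfect complement of the target overhang are assumptions made for illustration.

```python
# Illustrative sketch only: names and the identity interpretation are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_identity(a, b):
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def termini_match(target_overhang, donor_overhang,
                  min_length=10, min_identity=80.0):
    """Check a pair of single-stranded termini, both written 5' to 3'.

    The donor overhang is compared with the perfect complement of the
    target overhang. The termini are treated as matched when both are
    at least min_length nucleotides long and the identity is at least
    min_identity percent.
    """
    if len(target_overhang) != len(donor_overhang):
        return False
    if len(target_overhang) < min_length:
        return False
    expected = reverse_complement(target_overhang)
    return percent_identity(expected, donor_overhang) >= min_identity
```

For example, a 12-nucleotide donor overhang that is the exact reverse complement of the target overhang passes at every identity threshold listed above, whereas an 8-nucleotide pair fails the minimum-length requirement.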

A number of compositions and methods may be used to generate cleaved nucleic acid. As examples, the nucleic acid molecules present inside cells may be cleaved by one or more zinc finger-FokI fusion proteins, one or more TAL nucleases, one or more CRISPR complexes, or one or more argonaute-nucleic acid complexes.

Further, cleaved nucleic acid molecules may have at least one terminus with a single-stranded region. Also, double-stranded breaks in nucleic acid molecules present inside cells may be generated by the formation of two nicks, one in each strand of the nucleic acid molecules. Such nicks may be used to generate cleaved nucleic acid molecules having at least one blunt terminus. Further, nicks made in cleaved nucleic acid molecules may be located at a distance selected from the group consisting of (a) from about two nucleotides to about forty nucleotides, (b) from about four nucleotides to about thirty nucleotides, (c) from about five nucleotides to about twenty nucleotides, and (d) from about five nucleotides to about thirty nucleotides.
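As a hedged illustration of the nick-spacing ranges (a) through (d) above, the following sketch computes the distance between two nicks on opposite strands and reports which of the recited ranges contain it. The coordinate convention, the function names, and the treatment of each “about” bound as an inclusive integer limit are assumptions, not statements from the disclosure.

```python
# Illustrative sketch only. Each "about X to about Y" range from the text
# is treated as the inclusive interval [X, Y]; this is an assumption.
RECITED_RANGES = {
    "a": (2, 40),
    "b": (4, 30),
    "c": (5, 20),
    "d": (5, 30),
}


def nick_offset(top_nick, bottom_nick):
    """Distance in nucleotides between a top-strand and a bottom-strand nick.

    Both positions use the same (hypothetical) top-strand coordinate system.
    """
    return abs(top_nick - bottom_nick)


def ranges_containing(offset):
    """Return the labels of the recited ranges that contain the offset."""
    return [label for label, (low, high) in RECITED_RANGES.items()
            if low <= offset <= high]
```

Nicks at hypothetical positions 100 and 130 are thirty nucleotides apart, the spacing that yields the thirty-nucleotide overhangs described for FIG. 1, and that offset falls within ranges (a), (b), and (d).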

The invention also includes compositions and methods related to donor nucleic acid molecules comprising one or more nuclease resistant groups. For example, the invention includes donor nucleic acid molecules containing one or more nuclease resistant groups in at least one strand of at least one terminus. Donor nucleic acid molecules may also contain one or more nuclease resistant groups in both strands of both termini. Further, the donor nucleic acid molecule may contain a single terminal phosphorothioate linkage in both strands of both termini. Along these lines, the donor nucleic acid molecule may contain two terminal phosphorothioate linkages in both strands of both termini.
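The placement of terminal phosphorothioate linkages described above can be sketched as a simple annotation of a strand sequence. The use of an asterisk between two bases to mark a phosphorothioate linkage is a notation assumed here for illustration, and the function name and example sequence are hypothetical.

```python
# Illustrative sketch only: '*' between two bases marks a phosphorothioate
# internucleosidic linkage. The notation is an assumption for this example.
def add_terminal_ps(seq, n_linkages=2):
    """Mark the n terminal internucleosidic linkages at each end of a strand."""
    seq = seq.upper()
    n_bonds = len(seq) - 1  # a strand of N bases has N - 1 linkages
    if n_linkages < 0 or 2 * n_linkages > n_bonds:
        raise ValueError("strand too short for the requested linkages")
    parts = []
    for i, base in enumerate(seq):
        parts.append(base)
        # Linkage i joins base i and base i + 1.
        if i < n_bonds and (i < n_linkages or i >= n_bonds - n_linkages):
            parts.append("*")
    return "".join(parts)
```

Applied to each strand of a double-stranded donor, `n_linkages=1` corresponds to a single terminal phosphorothioate linkage in both strands of both termini, and `n_linkages=2` corresponds to two.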

The invention also includes compositions and methods related to donor nucleic acid molecules having asymmetric termini. By “asymmetric termini” it is meant that the termini differ in one or more features related to homologous recombination. For example, the lengths of the terminal “matched” regions of sequence complementarity to the target locus may be different. Thus, one terminus may have forty nucleotides of sequence complementarity and the other terminus may have only fifteen nucleotides of sequence complementarity. In many instances, one or both asymmetric termini of donor nucleic acid molecules will be partially or fully single-stranded.

The invention further includes methods for generating donor nucleic acid molecules containing one or more nuclease resistant groups in at least one strand of at least one terminus. Such methods may comprise (a) generating two single-stranded nucleic acid molecules that share at least one region of sequence complementarity sufficient to allow for the two single-stranded nucleic acid molecules to hybridize to each other, wherein at least one of the two single-stranded nucleic acid molecules contains at least one nuclease resistant group, and (b) contacting the two single-stranded nucleic acid molecules with each other under conditions that allow for hybridization to produce a hybridized nucleic acid molecule. In some instances, the hybridized nucleic acid molecule contains at least one overhanging terminus and is the donor nucleic acid molecule. In other instances, the donor nucleic acid molecule may be generated by contacting the hybridized nucleic acid molecule generated in (b) with an exonuclease that is inhibited by the one or more (e.g., from about 1 to about 12, from about 1 to about 10, from about 1 to about 6, from about 1 to about 4, from about 2 to about 12, from about 2 to about 10, from about 2 to about 7, from about 2 to about 3, from about 4 to about 12, from about 8 to about 12, from about 8 to about 16, etc.) nuclease resistant groups under conditions that allow for the digestion of one or both termini of the hybridized nucleic acid molecule until the exonuclease reaches the one or more nuclease resistant groups, thereby generating the donor nucleic acid molecule. In some instances, two nuclease resistant groups will be present in both strands of both termini of the donor nucleic acid molecule (see FIG. 3).

The invention also includes methods for generating donor nucleic acid molecules containing one or more nuclease resistant groups in at least one strand (or both strands) of at least one terminus (or both termini). Such methods may comprise (a) generating two single-stranded nucleic acid molecules that share at least one region of sequence complementarity sufficient to allow for the two single-stranded nucleic acid molecules to hybridize to each other, wherein at least one of the two single-stranded nucleic acid molecules contains at least one nuclease resistant group, (b) contacting the two single-stranded nucleic acid molecules with each other under conditions that allow for the two molecules to hybridize, to generate a hybridized nucleic acid molecule, and (c) contacting the hybridized nucleic acid molecule with an exonuclease that is inhibited by the at least one nuclease resistant group under conditions that allow for the formation of the donor nucleic acid molecule. In some instances, the donor nucleic acid molecules may contain at least one terminal nuclease resistant group. In certain instances, the nuclease resistant groups include phosphorothioate linkages.

Additionally, the invention includes methods for generating donor nucleic acid molecules containing one or more nuclease resistant groups in at least one strand of at least one terminus. Such methods comprise (a) producing two single-stranded nucleic acid molecules capable of hybridizing with each other, wherein at least one of the two nucleic acid molecules contains at least one nuclease resistant group, and (b) contacting the two single-stranded nucleic acid molecules with each other under conditions that allow for the two molecules to hybridize, thereby generating the donor nucleic acid molecule, wherein the donor nucleic acid molecule contains at least one terminal single-stranded region of at least ten nucleotides in length that has sequence complementarity to a locus in a cell, and wherein the at least one terminal single-stranded region contains at least one nuclease resistant group.

In some aspects, the invention includes compositions comprising partially double-stranded donor nucleic acid molecules comprising two regions, as well as methods for making and using such nucleic acid molecules. The two regions comprise (a) a single-stranded region at least ten nucleotides in length and (b) a double-stranded region at least twenty base pairs in length, wherein the single-stranded region has sequence complementarity to a locus in a cell and at least one nuclease resistant group located on the non-overhanging strand within two nucleotides of the beginning of the double-stranded region. In some aspects, such compositions will further comprise a transfection reagent. Further, the partially double-stranded donor nucleic acid molecule may comprise at least one nuclease resistant group which forms a phosphorothioate linkage. In some instances, the last two internucleosidic linkages are phosphorothioate linkages. Also, the donor nucleic acid molecule may have one or more 5′ overhangs or 3′ overhangs. Additionally, the partially double-stranded donor nucleic acid molecule may have single-stranded regions at both termini.

In additional aspects, the invention includes methods for performing homologous recombination in a population of cells, the method comprising (a) contacting the population of cells with a nucleic acid cutting entity under conditions that allow for the generation of a double-stranded break at a target locus in nucleic acid present inside cells of the population, to produce cells containing an intracellular cleaved nucleic acid molecule, and (b) introducing a donor nucleic acid molecule into cells generated in step (a) under conditions that allow for homologous recombination to occur, wherein homologous recombination occurs at the target locus in at least 20% of the cells of the population. In related aspects, the target locus and/or the donor nucleic acid molecule have one or more of the following characteristics: (a) the target locus and the donor nucleic acid molecule share at least one matched terminus, (b) the donor nucleic acid molecule contains one or more nuclease resistant groups, (c) the donor nucleic acid molecule has asymmetric termini, (d) the target locus cut site is within 15 nucleotides of the location where alteration is desired, (e) the nucleic acid cutting entity, or components thereof, and the donor nucleic acid molecule are contacted with the cells of the population at different times, and/or (f) the amount of the donor nucleic acid molecule contacted with cells of the population is in a range that allows for efficient uptake and homologous recombination. Nucleic acid cutting entities that may be employed in such methods comprise one or more zinc finger-FokI fusion protein complexes, one or more TAL nucleases, one or more CRISPR complexes, or one or more argonaute-nucleic acid complexes. Further, the donor nucleic acid molecule may have asymmetric termini of different lengths. In some embodiments, the asymmetric termini of different lengths may comprise single-stranded regions of different lengths. Single-stranded regions used in the practice of the invention may be less than 100 (e.g., from about 10 to about 95, from about 20 to about 95, from about 30 to about 95, from about 40 to about 95, from about 50 to about 95, from about 10 to about 75, from about 20 to about 75, from about 25 to about 95, from about 25 to about 60, etc.) nucleotides in length. In some instances, the matched termini of the target locus and the donor nucleic acid molecule are single-stranded regions that share 100% sequence complementarity. In related aspects, nucleic acid at the target locus may be blunt ended and the donor nucleic acid molecule may have a matched terminus that is single-stranded. In some instances, hybridization of the matched termini of the target locus and the donor nucleic acid molecule results in the formation of a junction region containing nicks in both strands. In other instances, hybridization of the matched termini of the target locus and the donor nucleic acid molecule results in the formation of a junction region that contains gaps of no more than two nucleotides in one or both strands. In specific embodiments, the matched termini of the target locus and the donor nucleic acid molecule comprise 5′ single-stranded regions, 3′ single-stranded regions, or both 5′ and 3′ single-stranded regions.

It has been found that co-delivery of all homologous recombination components, in some instances, results in decreased efficiency of homologous recombination. Thus, in some aspects of the invention, the cells of the population may be contacted with the nucleic acid cutting entity, or components thereof, before the cells of the population are contacted with the donor nucleic acid molecule. Further, the cells of the population may be contacted with the nucleic acid cutting entity, or components thereof, for between 5 and 80 (e.g., from about 5 to about 60, from about 5 to about 50, from about 5 to about 45, from about 5 to about 40, from about 5 to about 35, from about 5 to about 30, from about 5 to about 25, from about 10 to about 50, from about 10 to about 40, from about 10 to about 30, from about 15 to about 40, etc.) minutes before the cells of the population are contacted with the donor nucleic acid molecule. In related, as well as other, aspects of the invention, the donor nucleic acid molecules may contain one or more nuclease resistant groups at one or more termini. Further, the donor nucleic acid molecules may contain two nuclease resistant groups at one or more termini. In some aspects, the donor nucleic acid molecule may contain two nuclease resistant groups at each terminus. In additional aspects, the donor nucleic acid molecule may contain two nuclease resistant groups in each strand at each terminus. Further, the one or more nuclease resistant groups may be phosphorothioate groups. In some aspects, the target locus cut site may be within 10 nucleotides of the location where alteration is desired. Further, the target locus cut site may comprise a single-stranded region that includes all or part of the location where alteration is desired. In addition, the single-stranded region may contain a single mismatched nucleotide between the target locus and the donor nucleic acid molecule.

It has also been found that adjustment of the amount of donor nucleic acid affects the efficiency of homologous recombination. In some embodiments of the invention, the amount of donor nucleic acid may be between 50 and 900 ng (e.g., from about 50 to about 800, from about 50 to about 700, from about 50 to about 600, from about 50 to about 500, from about 50 to about 400, from about 50 to about 300, from about 150 to about 800, from about 150 to about 650, from about 150 to about 550, from about 150 to about 450, from about 200 to about 600, etc.) per 1×10⁵ cells (e.g., animal cells, plant cells, insect cells, mammalian cells, human cells, rodent cells, etc.). Further, donor nucleic acid molecules may be introduced into cells of the population by any number of means, including electroporation or transfection.
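Because the donor amount above is stated per 1×10⁵ cells, scaling to a different cell number is a simple proportion. The following sketch shows that arithmetic under the assumption that the dose scales linearly with cell number. The function name and the example cell count are hypothetical.

```python
# Illustrative sketch only: linear scaling with cell number is an assumption.
REFERENCE_CELLS = 1e5
DOSE_RANGE_NG = (50.0, 900.0)  # ng of donor per 1e5 cells, from the text


def donor_mass_ng(cells, ng_per_1e5_cells):
    """Scale a per-1e5-cell donor dose (ng) to an arbitrary number of cells."""
    low, high = DOSE_RANGE_NG
    if not low <= ng_per_1e5_cells <= high:
        raise ValueError("dose is outside the 50 to 900 ng range")
    return ng_per_1e5_cells * cells / REFERENCE_CELLS
```

For instance, a dose of 300 ng per 1×10⁵ cells corresponds to 600 ng of donor for 2×10⁵ cells.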

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the principles disclosed herein, and the advantages thereof, reference is made to the following descriptions taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic showing a nicking based nucleic acid cleavage strategy using a nick based cleavage system (e.g., a two nick site CRISPR system). In the top portion of the figure, two lines represent double-stranded nucleic acid. Two nick sites are indicated by Nick Site 1 and Nick Site 2. The center portion of the figure shows the result of nicking at the two closely positioned nick sites on different strands. The result in this instance is a double-stranded break, resulting in the formation of two thirty nucleotide 5′ overhangs. The lower portion of this figure shows a nucleic acid segment with 5′ termini that share sequence complementarity with the break site.

FIG. 2 is a schematic showing a nicking based nucleic acid cleavage strategy using a nick based cleavage system (e.g., a two nick site CRISPR system). As in FIG. 1, in the top portion of the figure, two lines represent double-stranded nucleic acid. Two closely associated nick sites are indicated by Nick Site 1 and Nick Site 2. Two additional closely associated nick sites are indicated by Nick Site 3 and Nick Site 4. Cutting at all four nick sites results in the formation of a nucleic acid molecule having the structure shown at the center of this figure. The result in this instance is a pair of double-stranded breaks, resulting in the formation of a thirty nucleotide 5′ overhang at one location and a thirty nucleotide 3′ overhang at the other location. The lower portion of this figure shows a nucleic acid segment with a 5′ terminus and a 3′ terminus that each share sequence complementarity with the termini at the ends of the nucleic acid molecule represented in the center of this figure.

FIG. 3 shows a number of different formats of nucleic acid segments that may be used in various embodiments of the invention. The open circles at the termini represent nuclease resistant groups. Two circles mean that there are two groups. The black areas represent regions of sequence homology/complementarity with one or more loci of another nucleic acid molecule (e.g., chromosomal DNA). The cross hatched areas represent nucleic acid located between regions of sequence homology/complementarity in nucleic acid segments. This figure shows five different variations of nucleic acid segments that may be used in different aspects of the invention.

FIG. 4 is a representation of a nucleic acid segment hybridized to different nucleic acid termini. The nucleic acid molecule on the left side has a 3′ overhang. The nucleic acid molecule on the right side is blunt ended. The nucleic acid segment (shown in the middle) has a 5′ overhang that shares sequence complementarity with the 3′ overhang of the nucleic acid molecule on the left side.

FIG. 5 is a representation similar to that shown in FIG. 4 with the exception that the end of the matched terminus of the cleaved nucleic acid molecule is double-stranded and the matched terminus of the donor nucleic acid molecule is single-stranded. The black region represents complementary nucleic acid regions.

FIG. 6 is a schematic of a guide RNA molecule (104 nucleotides) showing the guide RNA bound to both Cas9 protein and a target genomic locus. Hairpin Region 1 is formed by the hybridization of complementary crRNA and tracrRNA regions joined by the nucleotides GAAA. Hairpin Region 2 is formed by a complementary region in the 3′ portion of the tracrRNA. FIG. 6 discloses SEQ ID NOs: 1-3, respectively, from top to bottom.

FIGS. 7A-7E. Sequential delivery of Cas9 RNP and donor DNA facilitated HDR. (FIG. 7A) Definition of PAM and non-PAM ssDNA donor. The PAM ssDNA donor is defined as the strand containing the NGG PAM sequence whereas the non-PAM ssDNA donor is defined as the strand complementary to the PAM strand. (FIG. 7B) PAM or non-PAM ssDNA (6 nt insertion). Cas9 RNP (Cas9 nuclease and the +5 gRNA) and a 97-mer PAM or non-PAM ssDNA oligonucleotide were co-delivered (RD) or sequentially delivered via electroporation to disrupted EmGFP stable cell lines with Cas9 RNP first and then donor (R->D) or donor first and then Cas9 RNP (D->R). A brief cell washing step was involved for sequential delivery. Two consecutive electroporations without wash of cells served as control (RD×2). The percentages of EmGFP-positive cells were determined by flow cytometry at 48 hours post transfection. (FIG. 7C) PAM ssDNA dose (6 nt insertion). Cas9 RNP and various amounts of PAM ssDNA oligonucleotide were sequentially delivered to disrupted EmGFP stable cell lines via electroporation. 0.33 μg of a ssDNA oligonucleotide per 10 μl reaction was equivalent to approximately 1 μM final concentration. Samples in the absence of donor (+gRNA) or gRNA (−gRNA) were used as controls. (FIG. 7D) dsDNA donor (6 nt insert). Cas9 RNP and a 400 bp dsDNA donor were co-delivered (RD) or sequentially delivered (R->D) to disrupted EmGFP stable cell lines. Samples in the absence of donor (+gRNA) or gRNA (−gRNA) served as controls. (FIG. 7E) PAM ssDNA (1 nt substitution). Cas9 RNP (Cas9 nuclease and the eBFP gRNA) and a 100-mer PAM ssDNA oligonucleotide were co-delivered (RD) or sequentially delivered via electroporation to HEK293 cells stably expressing eBFP with Cas9 RNP first and then donor (R->D) or donor first and then Cas9 RNP (D->R). Samples in the absence of donor (+gRNA) or gRNA (−gRNA) served as controls. The percentages of GFP-positive cells were determined by flow cytometry at 48 hours post transfection.
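The statement in the description of FIG. 7C that 0.33 μg of ssDNA oligonucleotide per 10 μl reaction corresponds to approximately 1 μM can be checked with a short calculation. The sketch below assumes the 97-mer oligonucleotide of FIG. 7B and a rule-of-thumb average mass of about 330 g/mol per nucleotide. Both the constant and the function name are assumptions introduced for illustration and are not taken from the disclosure.

```python
# Illustrative sketch only. The per-nucleotide mass is an approximation
# assumed for this example, not a value stated in the text.
AVG_NT_MASS_G_PER_MOL = 330.0


def ssdna_micromolar(mass_ug, length_nt, volume_ul):
    """Approximate molar concentration (uM) of an ssDNA oligonucleotide."""
    mol_weight = length_nt * AVG_NT_MASS_G_PER_MOL  # g/mol
    moles = mass_ug * 1e-6 / mol_weight             # ug -> g, then g -> mol
    liters = volume_ul * 1e-6                       # ul -> L
    return moles / liters * 1e6                     # mol/L -> umol/L
```

With these assumptions, `ssdna_micromolar(0.33, 97, 10)` evaluates to roughly 1.03 μM (about 10 pmol in 10 μl), consistent with the approximately 1 μM figure given above.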

FIGS. 8A-8D. Effects of oligonucleotide length and modification on HDR. (FIG. 8A) PAM ssDNA (6 nt insertion). Cas9 RNP (Cas9 nuclease and the +5 gRNA) and various lengths of PAM ssDNA oligonucleotide with (PS) or without phosphorothioate modification were sequentially delivered to disrupted EmGFP stable cell lines via electroporation. The various lengths of PAM ssDNA oligonucleotide were normalized to either equal mass or equal molarity. Samples in the absence of donor (+gRNA) or gRNA (−gRNA) served as controls. The percentages of GFP-positive cells were determined by flow cytometry at 48 hours post transfection. (FIG. 8B) PAM ssDNA (1 nt substitution). Cas9 RNP and various lengths of PAM ssDNA oligonucleotide with (PS) or without phosphorothioate modification were sequentially delivered to HEK293 cells expressing eBFP. The percentages of GFP-positive cells were determined by flow cytometry at 48 hours post transfection. (FIG. 8C) Verification by sequencing. The eBFP genomic locus was PCR-amplified, followed by cloning and Sanger sequencing of 96 samples. The relative percentage of wild type (wt), NHEJ, and HDR was plotted. (FIG. 8D) Examples of mutations. Examples of edited clones (not representing the actual percentages of NHEJ and HDR). FIG. 8D discloses SEQ ID NOs: 4-14, respectively, from top to bottom.

FIGS. 9A-9C. DSB in close proximity to insertion site enhanced HDR. (FIG. 9A) Available gRNAs flanking the insertion site. A series of gRNAs were designed and synthesized flanking the insertion site (↓) targeting either the top strand (▾) or bottom strand (▴). The numbers and ± signs indicate the position of the DSB upstream (−) or downstream (+) of the insertion site (0). (FIG. 9B) gRNA cleavage efficiency. A series of gRNAs were associated with Cas9 nuclease separately and the resulting Cas9 RNPs were transfected into disrupted EmGFP stable cell lines. The percentages of Indel were evaluated at 48 hours post transfection. (FIG. 9C) dsDNA or ssDNA donors. A series of Cas9 RNPs along with a 400 bp dsDNA donor or a 97-base PAM ssDNA donor were sequentially delivered to disrupted EmGFP stable cell lines. Samples in the absence of donor (+gRNA) or gRNA (−gRNA) served as controls. The percentages of GFP-positive cells were determined by flow cytometry at 48 hours post transfection.

FIGS. 10A-10C. Asymmetric ssDNA donors enhanced HDR. (FIG. 10A) Asymmetric PAM or Non-PAM strand ssDNA annealing. Two separate gRNAs flanking the insertion site (↓ with a 0 above) were designed and synthesized such that double-stranded breaks (DSBs) occurred at positions −3 and +5, respectively (▴). Upon end resection of the DSB, the 3′ recessive ends were generated in two opposite orientations, which could anneal to either PAM (a) or non-PAM (b) ssDNA donors. The PAM ssDNA oligonucleotide is defined as the strand containing the NGG PAM sequence. (FIG. 10B) Asymmetrical donor design. A series of ssDNA donors were designed with various numbers of nucleotides on the left arm (−) and right arm (+) of the insertion site. Both the PAM and non-PAM strands were tested. The Cas9 RNP (1.5 μg Cas9 nuclease, 360 ng gRNA) and ssDNA donors (10 pmol) were sequentially delivered to disrupted EmGFP stable HEK293 cell lines. At 48 hours post transfection, the % Indel was determined by the GCD assay (FIG. 9B), whereas the percentages of EmGFP-positive cells were determined by flow cytometry. (FIG. 10C) Normalized HDR efficiencies. The bar graphs represent the normalized HDR efficiency (% EmGFP+ cells / % Indel) as averages of three individual experiments.
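The normalization used for FIG. 10C, the percentage of EmGFP-positive cells divided by the percentage of Indel, can be written out as a small sketch. Whether the three experiments were averaged before or after taking the ratio is not stated, so averaging the per-experiment ratios is an assumption here. The function names and replicate values are hypothetical and are not data from the disclosure.

```python
# Illustrative sketch only: averaging per-experiment ratios is an assumption,
# and the replicate values used in the usage note are hypothetical.
from statistics import mean


def normalized_hdr(pct_gfp_positive, pct_indel):
    """Normalized HDR efficiency: % EmGFP-positive cells divided by % Indel."""
    if pct_indel <= 0:
        raise ValueError("% Indel must be positive to normalize")
    return pct_gfp_positive / pct_indel


def mean_normalized_hdr(replicates):
    """Average the normalized efficiency over (% GFP+, % Indel) pairs."""
    return mean(normalized_hdr(gfp, indel) for gfp, indel in replicates)
```

As a usage example with made-up numbers, `mean_normalized_hdr([(12.0, 60.0), (15.0, 60.0), (9.0, 60.0)])` averages the ratios 0.20, 0.25, and 0.15.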

FIGS. 11A-11E. Insertion of a FLAG tag along with an EcoRI site using dsDNA donor with single-stranded overhangs. (FIG. 11A) Various donor DNA molecules containing a 30-base FLAG tag along with an EcoRI site were designed and synthesized, including single-stranded DNA donor (ssDNA), blunt-end dsDNA donor (blunt), dsDNA donor with 5′ overhang (5′), dsDNA donor with 3′ overhang (3′). The length of overhangs varied from 6 nucleotides (6), 15 nucleotides (15) to 30 nucleotides (30). The 3′ and 5′ ends of the oligonucleotides harbored two consecutive phosphorothioate-modified bases (Table 4). The short dsDNA donors with and without overhangs were prepared by annealing two short DNA oligonucleotides. The Cas9 RNP targeting the eBFP gene and various forms of DNA donors were sequentially delivered to HEK293 cells expressing eBFP. At 48 hours post transfection, the eBFP locus was PCR-amplified. The resulting PCR fragments were analyzed by the genomic cleavage and detection assays to determine the percentage of Indel or subjected to restriction digestion with EcoRI to determine the percentage of digestion. (FIG. 11B) Length of 3′ overhang. The dsDNA donors with 15, 24, 30, 36, or 45-base 3′ overhang were sequentially delivered with Cas9 RNP to HEK293 cells expressing eBFP. Alternatively, a dsDNA donor with 30-base 3′ overhang but without phosphorothioate modification (30-3′n) was used. The percentage of digestion with EcoRI was determined at 48 hours post transfection. (FIG. 11C) Dose effect. Cas9 RNP and various amounts of ssDNA donor or dsDNA donor with 30-base 3′ overhangs were sequentially delivered to HEK293 cells expressing eBFP. The eBFP loci were PCR-amplified. The resulting PCR fragments were analyzed by EcoRI digest. (FIG. 11D) Sequencing verification. The PCR fragments were cloned into E. coli and 192 clones were randomly picked for sequencing. The relative percentage of wild type (wt), NHEJ, and HDR clones derived from either ssDNA donor (ssDNA) or dsDNA with 3′ overhangs (3′ overhang) was plotted. The white rectangles represented the population of clones that contained the insert but with a point mutation. Examples of edited clones were shown in (FIG. 11E) (not representing the actual percentages of NHEJ and HDR). The underlined sequences represented the FLAG tag along with an EcoRI site. FIG. 11E discloses SEQ ID NOs: 15-24, respectively, from top to bottom.

FIGS. 12A-12D. Various DSB repair pathways. (FIG. 12A) DNA repair through NHEJ pathway. (FIG. 12B) DNA repair by either PAM or non-PAM ssDNA oligonucleotide. (FIG. 12C) DNA repair by dsDNA donor. (FIG. 12D) DNA repair by dsDNA donor with 3′ single-stranded overhangs.

FIGS. 13A-13B. Generation of stable cell lines for HDR assays. (FIG. 13A) A disrupted EmGFP HEK293 stable cell line containing a deletion of “CACCTT” (SEQ ID NO: 25) was generated by transfecting cells with Cas9 RNPs, followed by limiting dilution and clonal isolation. HDR assays were carried out by transfecting disrupted EmGFP HEK293 cells with Cas9 RNP and donor DNA, followed by flow cytometric analysis at 48 hours post transfection. Transfections without either donor DNA (Cas9 RNP) or gRNA (Cas9/donor) were used as controls. Fluorescence was only seen in the Cas9 RNP/Donor treated cells (data not shown). (FIG. 13B) A stable HEK293FT cell line expressing the eBFP gene was generated using a lentiviral delivery system. A point mutation from “C” to “T” would convert His66 to Tyr66, resulting in generation of a variant of GFP. HDR assays were performed by transfecting eBFP-expressing HEK293 cells with Cas9 RNP and donor DNA, followed by flow cytometric analysis to determine the percentage of GFP-positive cells at 48 hours post transfection. Transfections without either donor DNA (Cas9 RNP) or gRNA (Cas9/donor) served as controls. Green fluorescence was only seen in the Cas9 RNP/Donor treated cells (data not shown). FIGS. 13A-13B disclose SEQ ID NOs: 26-33, respectively, from top to bottom.

FIG. 14A. Both asymmetric PAM and non-PAM ssDNA donors facilitate HDR. Three separate gRNAs flanking the insertion site (↓ with a 0 above) were designed (top of figure) and synthesized such that double-stranded breaks (DSBs) occurred at positions −3, +3, and +5, respectively. The PAM strand is defined as the NGG-containing strand. The +3 gRNA's PAM is on the top 5′ to 3′ strand (▾), whereas the −3 and +5 gRNAs have PAMs on the bottom 3′ to 5′ strand (▴). A series of ssDNA donors (lower left of figure) were designed with various numbers of nucleotides on the left arm (−) and right arm (+) of the insertion site. Both the PAM and non-PAM strands were used. The Cas9 RNP (1.5 μg Cas9 nuclease, 360 ng of the +3 gRNA) and ssDNA donors (10 pmol) were sequentially delivered to disrupted EmGFP stable HEK293 cell lines (lower right of figure). At 48 hours post transfection, the % Indel was determined by the Genomic Cleavage and Detection assay, whereas the percentages of EmGFP-positive cells were determined by flow cytometry.

FIG. 14B. This figure is similar to FIG. 14A except that the −3 gRNA (center) or +5 gRNA (right) was used.
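The PAM-strand definition used in the descriptions of FIGS. 7A, 10A, and 14A (the strand containing the NGG PAM sequence) can be illustrated by locating a gRNA target sequence followed immediately by NGG on either strand of a locus. The sketch below is a simplified illustration. The function names and the example sequences are hypothetical, and the layout of the target sequence directly 5′ of the PAM reflects the general arrangement for Cas9 rather than anything specific to the disclosed gRNAs.

```python
# Illustrative sketch only: sequences and names are hypothetical.
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pam_strand(top_strand, protospacer):
    """Report which strand carries the target sequence followed by NGG.

    top_strand is the top strand of the locus written 5' to 3'. Returns
    'top', 'bottom', or None when no target-plus-NGG site is found.
    """
    pattern = re.compile(re.escape(protospacer.upper()) + "[ACGT]GG")
    if pattern.search(top_strand.upper()):
        return "top"
    if pattern.search(reverse_complement(top_strand)):
        return "bottom"
    return None
```

Under this definition, an ssDNA donor with the same sequence as the strand returned by `pam_strand` would be a PAM ssDNA donor, and its complement would be a non-PAM ssDNA donor.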

DETAILED DESCRIPTION

Definitions:

As used herein the term “homologous recombination” refers to a mechanism of genetic recombination in which two DNA strands comprising similar nucleotide sequences exchange genetic material. Cells use homologous recombination during meiosis, where it serves to rearrange DNA to create an entirely unique set of haploid chromosomes, but also for the repair of damaged DNA, in particular for the repair of double strand breaks. The mechanism of homologous recombination is well known to the skilled person and has been described, for example by Paques and Haber (Paques F, Haber J E.; Microbiol. Mol. Biol. Rev. 63:349-404 (1999)). In the method of the present invention, homologous recombination is enabled by the presence of said first and said second flanking element being placed upstream (5′) and downstream (3′), respectively, of said donor DNA sequence each of which being homologous to a continuous DNA sequence within said target sequence.

As used herein the term “non-homologous end joining” (NHEJ) refers to cellular processes that join the two ends of double-strand breaks (DSBs) through a process largely independent of homology. Naturally occurring DSBs are generated spontaneously during DNA synthesis when the replication fork encounters a damaged template and during certain specialized cellular processes, including V(D)J recombination, class-switch recombination at the immunoglobulin heavy chain (IgH) locus and meiosis. In addition, exposure of cells to ionizing radiation (X-rays and gamma rays), UV light, topoisomerase poisons or radiomimetic drugs can produce DSBs. Depending on the specific sequences and chemical modifications generated at the DSB, NHEJ may be precise or mutagenic (Lieber M R., The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annu Rev Biochem 79:181-211).

As used herein the term “donor DNA” or “donor nucleic acid” refers to nucleic acid that is designed to be introduced into a locus by homologous recombination. Donor nucleic acid will have at least one region of sequence homology to the locus. In many instances, donor nucleic acid will have two regions of sequence homology to the locus. These regions of homology may be at one or both termini or may be internal to the donor nucleic acid. In many instances, an “insert” region with nucleic acid that one desires to be introduced into a nucleic acid molecule present in a cell will be located between two regions of homology (see FIG. 2).

As used herein the term “homologous recombination system” or “HR system” refers to components of systems set out herein that may be used to alter cells by homologous recombination. These include, in particular, zinc finger nucleases, TAL effector nucleases, CRISPR endonucleases, homing endonucleases, and argonaute editing systems.

As used herein the term “nucleic acid cutting entity” refers to a single molecule or a complex of molecules that has nucleic acid cutting activity (e.g., double-stranded nucleic acid cutting activity). Exemplary nucleic acid cutting entities include zinc finger proteins, transcription activator-like effectors (TALEs), CRISPR complexes, and homing meganucleases. In many instances, nucleic acid cutting entities will have an activity that allows them to be nuclear localized (e.g., will contain nuclear localization signals (NLS)).

As used herein the term “zinc finger protein (ZFP)” refers to a polypeptide having nucleic acid (e.g., DNA) binding domains that are stabilized by zinc. The individual DNA binding domains are typically referred to as “fingers,” such that a zinc finger protein or polypeptide has at least one finger, more typically two fingers, or three fingers, or even four or five fingers, to at least six or more fingers. In some aspects, ZFPs will contain three or four zinc fingers. Each finger typically binds from two to four base pairs of DNA. Each finger usually comprises a zinc-chelating, DNA-binding region of about 30 amino acids (see, e.g., U.S. Pat. Publ. No. 2012/0329067 A1, the disclosure of which is incorporated herein by reference).

As used herein the term “transcription activator-like effectors (TAL)” refers to proteins that are composed of more than one TAL repeat and are capable of binding to nucleic acid in a sequence specific manner. In many instances, TAL effectors will contain at least six (e.g., at least 8, at least 10, at least 12, at least 15, at least 17, from about 6 to about 25, from about 6 to about 35, from about 8 to about 25, from about 10 to about 25, from about 12 to about 25, from about 8 to about 22, from about 10 to about 22, from about 12 to about 22, from about 6 to about 20, from about 8 to about 20, from about 10 to about 22, from about 12 to about 20, from about 6 to about 18, from about 10 to about 18, from about 12 to about 18, etc.) TAL repeats. In some instances, a TAL effector may contain 18 or 24 or 17.5 or 23.5 TAL nucleic acid binding cassettes. In additional instances, a TAL effector may contain 15.5, 16.5, 18.5, 19.5, 20.5, 21.5, 22.5 or 24.5 TAL nucleic acid binding cassettes. TAL effectors will generally have at least one polypeptide region which flanks the region containing the TAL repeats. In many instances, flanking regions will be present at both the amino and carboxyl termini of the TAL repeats. Exemplary TALs are set out in U.S. Pat. Publ. No. 2013/0274129 A1 and may be modified forms of naturally occurring proteins found in bacteria of the genera Burkholderia, Xanthomonas and Ralstonia.

In many instances, TAL proteins will contain nuclear localization signals (NLS) that allow them to be transported to the nucleus.

As used herein the term “CRISPR complex” refers to the CRISPR proteins and nucleic acid (e.g., RNA) that associate with each other to form an aggregate that has functional activity. An example of a CRISPR complex is a wild-type Cas9 (sometimes referred to as Csn1) protein that is bound to a guide RNA specific for a target locus.

As used herein the term “CRISPR protein” refers to a protein comprising a nucleic acid (e.g., RNA) binding domain and an effector domain (e.g., Cas9, such as Streptococcus pyogenes Cas9). The nucleic acid binding domains interact with a first nucleic acid molecule that either has a region capable of hybridizing to a desired target nucleic acid (e.g., a guide RNA) or allows for the association with a second nucleic acid having a region capable of hybridizing to the desired target nucleic acid (e.g., a crRNA). CRISPR proteins can also comprise nuclease domains (i.e., DNase or RNase domains), additional DNA binding domains, helicase domains, protein-protein interaction domains, dimerization domains, as well as other domains.

CRISPR protein also refers to proteins that form a complex that binds the first nucleic acid molecule referred to above. Thus, one CRISPR protein may bind to, for example, a guide RNA and another protein may have endonuclease activity. These are all considered to be CRISPR proteins because they function as part of a complex that performs the same functions as a single protein such as Cas9.

In many instances, CRISPR proteins will contain nuclear localization signals (NLS) that allow them to be transported to the nucleus.

As used herein, the term “target locus” refers to a site within a nucleic acid molecule that is recognized and cleaved by a nucleic acid cutting entity. When, for example, a single CRISPR complex is designed to cleave double-stranded nucleic acid, then the target locus is the cut site and the surrounding region recognized by the CRISPR complex. When, for example, two CRISPR complexes are designed to nick double-stranded nucleic acid in close proximity to create a double-stranded break, then the surrounding region recognized by both CRISPR complexes and including the break point is referred to as the target locus.

As used herein, the term “nuclease-resistant group” refers to a chemical group that may be incorporated into nucleic acid molecules and can inhibit degradation by enzymes (exonucleases and/or endonucleases) of nucleic acid molecules containing the group. Examples of such groups are phosphorothioate internucleotide linkages, 2′-O-methyl nucleotides, 2′-deoxy-2′-fluoro nucleotides, 2′-deoxy nucleotides, and 5-C-methyl nucleotides.

As used herein, the term “double-stranded break site” refers to a location in a nucleic acid molecule where a double-stranded break occurs. In many instances, this will be generated by the nicking of the nucleic acid molecule at two close locations (e.g., within from about 3 to about 50 base pairs, from about 5 to about 50 base pairs, from about 10 to about 50 base pairs, from about 15 to about 50 base pairs, from about 20 to about 50 base pairs, from about 3 to about 40 base pairs, from about 5 to about 40 base pairs, from about 10 to about 40 base pairs, from about 15 to about 40 base pairs, from about 20 to about 40 base pairs, etc.). Typically, nicks may be further apart in nucleic acid regions that contain higher AT content, as compared to nucleic acid regions that contain higher GC content.

As used herein, the term “matched termini” refers to termini of nucleic acid molecules that share sequence identity of greater than 90%. A matched terminus of a DS break at a target locus may be double-stranded or single-stranded. A matched terminus of a donor nucleic acid molecule will generally be single-stranded.

Overview:

The invention relates, in part, to compositions and methods for enhancing the efficiency of gene editing reactions via, for example, homologous recombination. The invention also relates, in part, to increasing the homologous recombination (HR) to non-homologous end-joining (NHEJ) ratio. Both of these aspects of the invention may be achieved by the delivery of donor nucleic acid to a target locus by associating it with one or more nucleic acid cutting entities. While not wishing to be bound to theory, it is believed that both increased HR efficiency and increased HR as compared to NHEJ are the result of a high local concentration of donor nucleic acid at target loci that have a double-stranded (DS) break.

In some instances, methods of the invention employ at least one donor nucleic acid that has termini that are “matched” to termini of the cut site. Examples of some embodiments of compositions and methods of the invention are set out in FIG. 1. FIG. 1 shows two nick sites designed to generate a double-stranded (DS) break in a DNA molecule. The DS break has two 5′ overhangs of 30 nucleotides each. The DS donor nucleic acid molecule has two 5′ overhangs of 30 nucleotides each with sequence complementarity to the 5′ overhangs generated in the cut nucleic acid molecule.

In the instance shown in FIG. 1, the donor nucleic acid molecule is designed to hybridize to both termini of the cut nucleic acid molecule in a manner that a DNA ligase would be able to repair the cut site with an introduction of an “insert” nucleic acid segment into the cut nucleic acid molecule.
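By way of illustration only, the relationship between two offset nicks and the overhangs of the resulting DS break may be sketched in Python as follows. The function name and the coordinate convention (0-based positions along the top strand, read 5′ to 3′) are assumptions made for this example and are not part of the disclosure. Under this convention, a top-strand nick located upstream of the bottom-strand nick leaves 5′ overhangs, and the reverse arrangement leaves 3′ overhangs.

```python
from typing import Tuple


def overhang_from_nicks(top_nick: int, bottom_nick: int) -> Tuple[str, int]:
    """Classify the overhangs left by one nick on each strand.

    Both positions are hypothetical 0-based coordinates measured along the
    top strand (5' -> 3'). A nick at position p breaks the backbone between
    base p - 1 and base p of the strand concerned.
    """
    length = abs(bottom_nick - top_nick)
    if length == 0:
        # Nicks directly opposite each other give a blunt DS break.
        return ("blunt", 0)
    # Top nick upstream of bottom nick: the protruding single strands end in
    # 5' termini. The opposite arrangement gives 3' protruding termini.
    kind = "5'" if top_nick < bottom_nick else "3'"
    return (kind, length)


if __name__ == "__main__":
    # Nicks offset by 30 positions, as in the 30-nucleotide overhangs of FIG. 1.
    print(overhang_from_nicks(100, 130))
```

Both termini of the break carry overhangs of the same type and length in this simple two-nick model; the four-nick arrangement of FIG. 2 would need each terminus to be evaluated separately.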

FIG. 2 shows another variation of the invention where four nicks are generated to remove a segment of the nucleic acid molecule that is cut. Further, the cut nucleic acid molecule has a 3′ overhang at one terminus and a 5′ overhang on the other terminus. The termini of the donor nucleic acid molecule are again designed to match those at the cut site.

In some aspects, the invention relates to compositions and methods for enhancing gene editing systems. Some of the features of such enhanced systems include one or more of the following: (1) delivery of one or more gene editing molecules (e.g., Cas9, gRNA, mRNA encoding a TAL effector, etc.) and donor nucleic acid molecules at different times, (2) the “matching” of termini between target loci and donor nucleic acid molecules, (3) designing of termini between target loci and donor nucleic acid molecules to maximize recombination efficiency, (4) adjustment of the amount of donor nucleic acid that the cells are contacted with, (5) the amount of donor nucleic acid delivered per cell (e.g., the average number of donor nucleic acid molecules delivered per cell), (6) protection of terminal regions of donor nucleic acid molecules from nucleases, and (7) the use of donor nucleic acid molecules with asymmetric single-stranded termini (e.g., one terminal single-stranded region is of a different length than the other terminal single-stranded region).

Donor Nucleic Acid Molecules and Homologous Recombination

Donor nucleic acids will typically contain regions of homology corresponding to nucleic acid at or near a target locus. Exemplary donor nucleic acid molecules are shown in FIGS. 1-5. Using the nucleic acid molecules set out in FIG. 3 for purposes of illustration, donor nucleic acid may be single-stranded (SS) or double-stranded (DS) and it may be blunt ended on one or both ends or it may have overhangs on one or both ends. Further, overhangs, when present, may be 5′, 3′ or 3′ and 5′. Also, the lengths of overhangs may vary. Donor nucleic acid molecules will often also contain an “insert” region that may be from about one nucleotide to about several thousand nucleotides.

In one aspect of the invention it has been found that the efficiency of homologous recombination is enhanced when one or both termini of donor nucleic acid molecules “match” those of the DS break into which the donor is designed to be introduced. Further, upon entry into cells (as well as prior to cellular entry), donor nucleic acid molecules may be exposed to nucleases (e.g., exonucleases, endonucleases, etc.). In order to limit the action of nucleases with respect to altering donor nucleic acid molecules, one or more nuclease resistant groups may be present.

FIG. 3 shows a number of variations of donor nucleic acid molecules that may be used in aspects of the invention. The open circles at the termini represent nuclease resistant groups. Such groups may be located at a number of places in the donor nucleic acid molecules. Donor nucleic acid molecule number 6 shows a 3′ terminal region of the lower strand that is located past the nuclease resistant groups. In some instances, cellular nucleases will digest this portion of the donor nucleic acid molecule. These nucleases will either stop or be slowed down by the nuclease resistant group, thereby stabilizing the structure of the terminus of the 3′ region of the lower strand.

The invention thus includes compositions comprising nucleic acid molecules containing one or more (e.g., one, two, three, four, five, six, seven, etc.) nuclease resistant groups, as well as methods for making and using such donor nucleic acid molecules. In many instances, nuclease resistant groups will be located at one or both termini of donor nucleic acid molecules. Donor nucleic acid molecules may contain groups interior from one or both termini. In many instances, some or all of such donor nucleic acid molecules will be processed within cells to generate termini that match DS break sites.

The homology regions may be of varying lengths and may have varying amounts of sequence identity with nucleic acid at the target locus. Typically, homologous recombination efficiency increases with increased lengths and sequence identity of homology regions. The length of homology regions employed is often determined by factors such as fragility of large nucleic acid molecules, transfection efficiency, and ease of generation of nucleic acid molecules containing homology regions.

Homology regions may be from about 20 bases to about 10,000 bases in total length (e.g., from about 20 bases to about 100 bases, from about 30 bases to about 100 bases, from about 40 bases to about 100 bases, from about 50 bases to about 8,000 bases, from about 50 bases to about 7,000 bases, from about 50 bases to about 6,000 bases, from about 50 bases to about 5,000 bases, from about 50 bases to about 3,000 bases, from about 50 bases to about 2,000 bases, from about 50 bases to about 1,000 bases, from about 50 bases to about 800 bases, from about 50 bases to about 600 bases, from about 50 bases to about 500 bases, from about 50 bases to about 400 bases, from about 50 bases to about 300 bases, from about 50 bases to about 200 bases, from about 100 bases to about 8,000 bases, from about 100 bases to about 2,000 bases, from about 100 bases to about 1,000 bases, from about 100 bases to about 700 bases, from about 100 bases to about 600 bases, from about 100 bases to about 400 bases, from about 100 bases to about 300 bases, from about 150 bases to about 1,000 bases, from about 150 bases to about 500 bases, from about 150 bases to about 400 bases, from about 200 bases to about 1,000 bases, from about 200 bases to about 600 bases, from about 200 bases to about 400 bases, from about 200 bases to about 300 bases, from about 250 bases to about 2,000 bases, from about 250 bases to about 1,000 bases, from about 350 bases to about 2,000 bases, from about 350 bases to about 1,000 bases, etc.).

In some instances, it may be desirable to use regions of sequence homology that are less than 200 bases in length. This will often be the case when the donor nucleic acid molecule contains a small insert (e.g., less than about 300 bases) and/or when the donor nucleic acid molecule has one or two overhanging termini that match the DS break site.

Overhanging termini may be of various lengths and may be of different lengths at each end of the same donor nucleic acid molecule. In many instances, these overhangs will form the regions of sequence homology. FIG. 3, for example, shows a series of donor nucleic acid molecules that have 30 nucleotide single-stranded overhangs. These donor nucleic acid molecules are single-stranded and double-stranded. Donor nucleic acid molecule number 1 in FIG. 3 is a single-stranded molecule that has 30 nucleotides of sequence homology with an intended DS break site, a 30 nucleotide insert, and two nuclease resistant groups at each terminus. While a donor nucleic acid molecule of this type can be used with a number of DS break sites, it may also be used with a DS break site of the type shown in FIG. 2. Thus, the invention includes compositions and methods for the introduction of single-stranded donor nucleic acid molecules into a target locus.

Typically, the greater the amount of sequence identity the homologous regions share with the nucleic acid at the target locus, the higher the homologous recombination efficiency. High levels of sequence identity are especially desired when the homologous regions are fairly short (e.g., 50 bases). Typically, the amount of sequence identity between the target locus and the homologous regions will be greater than 90% (e.g., from about 90% to about 100%, from about 90% to about 99%, from about 90% to about 98%, from about 95% to about 100%, from about 95% to about 99%, from about 95% to about 98%, from about 97% to about 100%, etc.).

As used herein, “percentage of sequence identity” means the value determined by comparing two optimally aligned nucleotide sequences over a comparison window, wherein the portion of the nucleotide sequence in the comparison window may comprise additions or deletions (i.e., sequence alignment gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. In other words, sequence alignment gaps are removed for quantification purposes. The percentage of sequence identity is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
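For purposes of illustration only, the calculation described above may be sketched in Python as follows. The sketch assumes the two sequences have already been optimally aligned (e.g., by an external alignment program) and are supplied as equal-length strings in which "-" marks an alignment gap. The function name and the count_gap_columns option are hypothetical. By default, gap columns are left out of the position count, which is one reading of the statement that gaps are removed for quantification purposes; setting the option to True counts every column in the window instead.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     count_gap_columns: bool = False) -> float:
    """Percentage of sequence identity over a pre-aligned comparison window.

    Assumes '-' denotes an alignment gap and that both strings cover the
    same window, so they must be of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    positions = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        is_gap = a == "-" or b == "-"
        if is_gap and not count_gap_columns:
            continue  # gap columns are removed from the quantification
        positions += 1
        if not is_gap and a == b:
            matches += 1

    if positions == 0:
        raise ValueError("no comparable positions in the window")
    # matched positions / total positions in the window, times 100
    return 100.0 * matches / positions


if __name__ == "__main__":
    # Nine of ten positions match.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

A homology region scoring above 90 in such a calculation would fall within the range of sequence identity typically contemplated herein.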

One method for determining sequence identity values is through the use of the BLAST 2.0 suite of programs using default parameters (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)). Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information.

The insert region of donor nucleic acid molecules may be of a variety of lengths, depending upon the application that it is intended for. In many instances, donor nucleic acid molecules will be from about 1 to about 4,000 bases in length (e.g., from about 1 to 3,000, from about 1 to 2,000, from about 1 to 1,500, from about 1 to 1,000, from about 2 to 1,000, from about 3 to 1,000, from about 5 to 1,000, from about 10 to 1,000, from about 10 to 400, from about 10 to 50, from about 15 to 65, from about 2 to 15, etc. bases).

The invention also provides compositions and methods for the introduction into intracellular nucleic acid of a small number of bases (e.g., from about 1 to about 10, from about 1 to about 6, from about 1 to about 5, from about 1 to about 2, from about 2 to about 10, from about 2 to about 6, from about 3 to about 8, etc.). For purposes of illustration, a donor nucleic acid molecule may be prepared that is fifty-one base pairs in length. This donor nucleic acid molecule may have two homology regions that are 25 base pairs in length with the insert region being a single base pair. When nucleic acid surrounding the target locus essentially matches the regions of homology with no intervening base pairs, homologous recombination will result in the introduction of a single base pair at the target locus. Homologous recombination reactions such as this can be employed, for example, to disrupt protein coding reading frames, resulting in the introduction of a frame shift in intracellular nucleic acid. The invention thus provides compositions and methods for the introduction of one or a small number of bases into intracellular nucleic acid molecules.
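The fifty-one base pair example above may be sketched in Python as follows, for illustration only. The function names, the placeholder locus sequence, and the use of a single 0-based index to mark the insertion point are assumptions of this sketch; only one strand of the donor is modeled.

```python
def build_insertion_donor(locus: str, cut_index: int, insert: str,
                          arm_length: int = 25) -> str:
    """Assemble a donor as left homology arm + insert + right homology arm.

    `locus` is the target sequence (top strand, 5' -> 3') and `cut_index`
    is the hypothetical 0-based position at which the insert is to land.
    The arms are copied from the bases immediately flanking that position,
    with no intervening bases.
    """
    if cut_index - arm_length < 0 or cut_index + arm_length > len(locus):
        raise ValueError("locus too short for the requested homology arms")
    left_arm = locus[cut_index - arm_length:cut_index]
    right_arm = locus[cut_index:cut_index + arm_length]
    return left_arm + insert + right_arm


def causes_frameshift(insert: str) -> bool:
    """An insertion shifts the reading frame unless its length is a multiple of 3."""
    return len(insert) % 3 != 0


if __name__ == "__main__":
    locus = "ACGT" * 15  # 60-base placeholder sequence, not a real locus
    donor = build_insertion_donor(locus, cut_index=30, insert="A")
    print(len(donor), causes_frameshift("A"))
```

With two 25-base arms and a one-base insert the donor is 51 bases long, and a one-base insertion within a coding region would be expected to shift the reading frame.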

The invention further provides compositions and methods for the alteration of short nucleotide sequences in intracellular nucleic acid molecules. One example of this would be the change of a single nucleotide position, with one example being the correction or alteration of a single-nucleotide polymorphism (SNP). Using SNP alteration for purposes of illustration, a donor nucleic acid molecule may be designed with two homology regions that are 25 base pairs in length. Located between these regions of homology is a single base pair that is essentially a “mismatch” for the corresponding base pair in the intracellular nucleic acid molecules. Thus, homologous recombination may be employed to alter the SNP by changing the base pair to either one that is considered to be wild-type or to another base (e.g., a different SNP). Cells that have correctly undergone homologous recombination may be identified by later sequencing of the target locus.

Donor nucleic acid may also contain elements desired for insertion (i.e., an insert) into an intracellular nucleic acid molecule (e.g., a chromosome or plasmid) by homologous recombination. Such elements may be selectable markers (e.g., a positive selectable marker such as an antibiotic resistance marker), promoter elements, or non-selectable marker protein coding nucleic acid (e.g., nucleic acid encoding cytokines, growth factors, etc.). Inserts may also encode detectable proteins such as luciferase and fluorescent proteins (e.g., green fluorescent protein and yellow fluorescent protein).

Compositions and methods of the invention are designed to result in high efficiency of homologous recombination in cells (e.g., eukaryotic cells such as plant cells and animal cells, such as insect cells and mammalian cells, including mouse, rat, hamster, rabbit and human cells). In some instances, homologous recombination efficiency is such that greater than 20% of cells in a population will have undergone homologous recombination at the desired target locus or loci. In some instances, homologous recombination may occur within from about 10% to about 65%, from about 15% to about 65%, from about 20% to about 65%, from about 30% to about 65%, from about 35% to about 65%, from about 10% to about 55%, from about 20% to about 55%, from about 30% to about 55%, from about 35% to about 55%, from about 40% to about 55%, from about 10% to about 45%, from about 20% to about 45%, from about 30% to about 45%, from about 40% to about 45%, from about 30% to about 50%, etc. of cells in a population.

Further, the invention includes compositions and methods for increasing the efficiency of homologous recombination within cells. For example, if homologous recombination occurs in 10% of a cell population under one set of conditions and in 40% of a cell population under another set of conditions, then the efficiency of homologous recombination has increased by 300%. In some aspects of the invention, the efficiency of homologous recombination may increase by from about 100% to about 500% (e.g., from about 100% to about 450%, from about 100% to about 400%, from about 100% to about 350%, from about 100% to about 300%, from about 200% to about 500%, from about 200% to about 400%, from about 250% to about 500%, from about 250% to about 400%, from about 250% to about 350%, from about 300% to about 500%, etc.).
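The arithmetic of the foregoing example (10% rising to 40% being an increase of 300%) may be sketched in Python as follows. The function name is hypothetical, and the sketch simply expresses the change relative to the baseline condition.

```python
def percent_increase(baseline_pct: float, improved_pct: float) -> float:
    """Relative increase in HR efficiency, as a percentage of the baseline.

    Both arguments are percentages of cells in a population that have
    undergone homologous recombination under the two sets of conditions.
    """
    if baseline_pct <= 0:
        raise ValueError("baseline efficiency must be greater than zero")
    return (improved_pct - baseline_pct) / baseline_pct * 100.0


if __name__ == "__main__":
    # 10% of cells under one set of conditions, 40% under another.
    print(percent_increase(10, 40))
```

Under this convention a doubling of efficiency corresponds to an increase of 100%, the lower end of the range recited above.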

One example of a set of conditions for which the efficiency of homologous recombination may be measured is where two otherwise identical donor nucleic acid molecules are used, where one has unmodified termini and the other has two phosphorothioate groups on each strand of each terminus. It has been found that such nuclease resistant groups can be used to increase the efficiency of homologous recombination. Further, such donor nucleic acid molecules may have termini that match the DS break site at the target locus. Regardless of the various parameters used for the homologous recombination reactions, the invention includes compositions and methods for increasing the efficiency of homologous recombination.

One homologous recombination assay that may be used in the practice of the invention is set out in the examples and employs the incorporation of a restriction site into a nucleic acid molecule by homologous recombination. Other assays involve nucleotide sequencing. Numerous other methods are known in the art.

In many instances, target loci will be cleaved in a manner that will result in blunt termini. In many instances, blunt ended matched termini will be contacted with donor nucleic acid molecules having single-stranded matched termini. In such instances, it has been found that single nucleotides at target loci can be replaced with nucleotides in donor nucleic acid molecules, when the target loci nucleotides are near the DS break (e.g., within 10 nucleotides of termini).

While not wishing to be bound by theory, it is thought that the above is due to 5′ strand resection, followed by favoring of the terminus of donor nucleic acid molecules in the repair process. Further, the closer to the DS break (up to about 10 nucleotides), the higher the probability that the target locus base will be replaced with a donor nucleic acid molecule base during the repair process. Thus, the invention includes compositions and methods for the introduction of single-base changes at a target locus, the method comprising generating a DS break (e.g., a blunt ended break) at the target locus, followed by contacting the break point with a donor nucleic acid molecule having a single base substitution in the cognate matching terminus. In most instances, the single base to be substituted will be positioned within 1, 2, 3, 4, 5, or 6 bases of the terminus of the target locus.

Nucleic Acid Cutting Entities

The invention relates, in part, to gene editing resulting from the interaction of donor nucleic acid molecules with target loci. A number of mechanisms and/or gene editing systems may be used to generate DS breaks at target loci. The mechanism used to generate DS breaks at target loci will typically be selected based upon a number of factors such as efficiency of DS break generation at target loci, the ability to generate DS breaks at suitable locations at or near target loci, low potential for DS break generation at undesired loci, low toxicity, and cost issues. A number of these factors will vary with the cell employed and target loci.

A number of gene editing systems that may be used in the practice of the invention are known in the art. These include zinc finger nucleases, TAL effector nucleases, CRISPR endonucleases, homing endonucleases, and argonaute editing systems.

In most instances, nucleic acid cutting entity components will be either proteins or nucleic acids or a combination of the two, but they may be associated with cofactors and/or other molecules.

A. Zinc Finger Based Systems

Zinc-finger nucleases (ZFNs) and meganucleases are examples of genome engineering tools that can be used to generate DS breaks in the practice of the invention. ZFNs are chimeric proteins consisting of a zinc-finger DNA-binding domain and a nuclease domain. One example of a nuclease domain is the non-specific cleavage domain from the type IIS restriction endonuclease FokI (Kim, Y G; Cha, J., Chandrasegaran, S. Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. USA. 1996 Feb. 6;93(3):1156-60) typically separated by a linker sequence of 5-7 base pairs. A pair of FokI cleavage domains is generally required to allow for dimerization of the domain and cleavage of a non-palindromic target sequence from opposite strands. The DNA-binding domains of individual Cys2His2 ZFNs typically contain between 3 and 6 individual zinc-finger repeats and can each recognize between 9 and 18 base pairs.

One problem associated with ZFNs is the possibility of off-target cleavage, which may lead to random integration of donor DNA or result in chromosomal rearrangements or even cell death, and which still raises concern about applicability in higher organisms (Zinc-finger Nuclease-induced Gene Repair With Oligodeoxynucleotides: Wanted and Unwanted Target Locus Modifications. Molecular Therapy vol. 18 no. 4, 743-753 (2010)).

B. TAL Effectors Based Systems

Transcription activator-like (TAL) effectors represent a class of DNA binding proteins secreted by plant-pathogenic bacteria of genera such as Xanthomonas and Ralstonia, via their type III secretion system upon infection of plant cells. Natural TAL effectors specifically have been shown to bind to plant promoter sequences, thereby modulating gene expression and activating effector-specific host genes to facilitate bacterial propagation (Römer, P., et al., Plant pathogen recognition mediated by promoter activation of the pepper Bs3 resistance gene. Science 318, 645-648 (2007); Boch, J. & Bonas, U. Xanthomonas AvrBs3 family-type III effectors: discovery and function. Annu. Rev. Phytopathol. 48, 419-436 (2010); Kay, S., et al. A bacterial effector acts as a plant transcription factor and induces a cell size regulator. Science 318, 648-651 (2007); Kay, S. & Bonas, U. How Xanthomonas type III effectors manipulate the host plant. Curr. Opin. Microbiol. 12, 37-43 (2009)).

Natural TAL effectors are generally characterized by a central repeat domain and a carboxyl-terminal nuclear localization signal sequence (NLS) and a transcriptional activation domain (AD). The central repeat domain typically consists of a variable number of between 1.5 and 33.5 amino acid repeats that are usually 33-35 residues in length except for a generally shorter carboxyl-terminal repeat referred to as half-repeat. The repeats are mostly identical but differ in certain hypervariable residues. DNA recognition specificity of TAL effectors is mediated by hypervariable residues typically at positions 12 and 13 of each repeat (the so-called repeat variable diresidue (RVD)), wherein each RVD targets a specific nucleotide in a given DNA sequence. Thus, the sequential order of repeats in a TAL protein tends to correlate with a defined linear order of nucleotides in a given DNA sequence. The underlying RVD code of some naturally occurring TAL effectors has been identified, allowing prediction of the sequential repeat order required to bind to a given DNA sequence (Boch, J. et al. Breaking the code of DNA binding specificity of TAL-type III effectors. Science 326, 1509-1512 (2009); Moscou, M. J. & Bogdanove, A. J. A simple cipher governs DNA recognition by TAL effectors. Science 326, 1501 (2009)). Further, TAL effectors generated with new repeat combinations have been shown to bind to target sequences predicted by this code. It has been shown that the target DNA sequence generally starts with a 5′ thymine base to be recognized by the TAL protein.
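For purposes of illustration only, the correspondence between the order of repeats and the order of target nucleotides may be sketched in Python as follows. The four RVD assignments in the table (NI to A, HD to C, NG to T, NN to G) are commonly reported assignments that are not set out in this disclosure and are included here as an assumption; NN in particular has also been reported to recognize A. The function and variable names are hypothetical.

```python
from typing import List

# Assumed RVD-to-nucleotide assignments (not taken from this disclosure).
# NN is mapped to G here, although recognition of A has also been reported.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def predict_target(rvds: List[str], add_5prime_t: bool = True) -> str:
    """Predict the DNA sequence bound by an ordered list of RVDs.

    One nucleotide is emitted per repeat, in repeat order. When
    `add_5prime_t` is True, the 5' thymine that generally precedes the
    target sequence is prepended.
    """
    bases = []
    for rvd in rvds:
        key = rvd.upper()
        if key not in RVD_TO_BASE:
            raise ValueError(f"no assignment assumed for RVD {rvd!r}")
        bases.append(RVD_TO_BASE[key])
    site = "".join(bases)
    return "T" + site if add_5prime_t else site


if __name__ == "__main__":
    print(predict_target(["NI", "HD", "NG", "NN"]))
```

The reverse operation, choosing a repeat order for a desired target sequence, would invert the same table, subject to the same assumptions.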

The modular structure of TALs allows for combination of the DNA binding domain with effector molecules such as nucleases. In particular, TAL effector nucleases allow for the development of new genome engineering tools.

TAL effectors used in the practice of the invention may generate DS breaks or may have a combined action for the generation of DS breaks. For example, TAL-FokI nuclease fusions can be designed to bind at or near a target locus and form double-stranded nucleic acid cutting activity by the association of two FokI domains.

C. CRISPR Based Systems

Gene altering reagents may be based upon CRISPR systems. The term “CRISPR” is a general term that applies to three types of systems, and system sub-types. In general, the term CRISPR refers to the repetitive regions that encode CRISPR system components (e.g., encoded crRNAs). Three types of CRISPR systems (see Table 1) have been identified, each with differing features.

TABLE 1 CRISPR System Types Overview

System: Type I. Features: Multiple proteins (5-7 proteins typical), crRNA, requires PAM. DNA cleavage is catalyzed by Cas3. Examples: Staphylococcus epidermidis (Type IA).

System: Type II. Features: 3-4 proteins (one protein (Cas9) has nuclease activity), two RNAs, requires PAMs. Target DNA cleavage catalyzed by Cas9 and RNA components. Examples: Streptococcus pyogenes CRISPR/Cas9, Francisella novicida U112 Cpf1.

System: Type III. Features: Five or six proteins required for cutting, number of required RNAs unknown but expected to be 1, PAMs not required. Type IIIB systems have the ability to target RNA. Examples: S. epidermidis (Type IIIA); P. furiosus (Type IIIB).

While the invention has numerous aspects and variations associated with it, the Type II CRISPR/Cas9 system has been chosen as a point of reference for explanation herein.

In certain aspects, the invention provides stabilized crRNAs, tracrRNAs,and/or guide RNAs (gRNAs), as well as collections of such RNA molecules.

FIG. 6 shows components and molecular interactions associated with a Type II CRISPR system. In this instance, the Cas9 mediated Streptococcus pyogenes system is exemplified. A gRNA is shown in FIG. 6 hybridizing to both target DNA (Hybridization Region 1) and tracrRNA (Hybridization Region 2). In this system, these two RNA molecules serve to bring the Cas9 protein to the target DNA sequence in a manner that allows for cutting of the target DNA. The target DNA is cut at two sites, to form a double-stranded break.

CRISPRs used in the practice of the invention may generate DS breaks or may have a combined action for the generation of DS breaks. For example, mutations may be introduced into CRISPR components that prevent CRISPR complexes from making DS breaks but still allow for these complexes to nick DNA. Mutations have been identified in Cas9 proteins that allow for the preparation of Cas9 proteins that nick DNA rather than making double-stranded cuts. Thus, the invention includes the use of Cas9 proteins that have mutations in RuvC and/or HNH domains that limit the nuclease activity of this protein to nicking activity.

CRISPR systems that may be used in the practice of the invention vary greatly. These systems will generally have the functional activity of being able to form a complex comprising a protein and a first nucleic acid, where the complex recognizes a second nucleic acid. CRISPR systems can be a type I, a type II, or a type III system. Non-limiting examples of suitable CRISPR proteins include Cas3, Cas4, Cas5, Cas5e (or CasD), Cas6, Cas6e, Cas6f, Cas7, Cas8a1, Cas8a2, Cas8b, Cas8c, Cas9, Cas10, Cas10d, CasF, CasG, CasH, Csy1, Csy2, Csy3, Cse1 (or CasA), Cse2 (or CasB), Cse3 (or CasE), Cse4 (or CasC), Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csz1, Csx15, Csf1, Csf2, Csf3, Csf4, and Cu1966.

In some embodiments, the CRISPR protein (e.g., Cas9) is derived from a type II CRISPR system. In specific embodiments, the CRISPR system is designed to act as an oligonucleotide (e.g., DNA or RNA)-guided endonuclease derived from a Cas9 protein. The Cas9 protein for this and other functions set out herein can be from Streptococcus pyogenes, Streptococcus thermophilus, Streptococcus sp., Nocardiopsis dassonvillei, Streptomyces pristinaespiralis, Streptomyces viridochromogenes, Streptosporangium roseum, Alicyclobacillus acidocaldarius, Bacillus pseudomycoides, Bacillus selenitireducens, Exiguobacterium sibiricum, Lactobacillus delbrueckii, Lactobacillus salivarius, Microscilla marina, Burkholderiales bacterium, Polaromonas naphthalenivorans, Polaromonas sp., Crocosphaera watsonii, Cyanothece sp., Microcystis aeruginosa, Synechococcus sp., Acetohalobium arabaticum, Ammonifex degensii, Caldicellulosiruptor bescii, Candidatus Desulforudis, Clostridium botulinum, Clostridium difficile, Finegoldia magna, Natranaerobius thermophilus, Pelotomaculum thermopropionicum, Acidithiobacillus caldus, Acidithiobacillus ferrooxidans, Allochromatium vinosum, Marinobacter sp., Nitrosococcus halophilus, Nitrosococcus watsonii, Pseudoalteromonas haloplanktis, Ktedonobacter racemifer, Methanohalobium evestigatum, Anabaena variabilis, Nodularia spumigena, Nostoc sp., Arthrospira maxima, Arthrospira platensis, Arthrospira sp., Lyngbya sp., Microcoleus chthonoplastes, Oscillatoria sp., Petrotoga mobilis, Thermosipho africanus, or Acaryochloris marina.

D. Argonaute Gene Editing Systems

The argonaute family of proteins are endonucleases that use 5′ phosphorylated single-stranded nucleic acids as guides to cleave nucleic acid targets. These proteins, like Cas9, are believed to have roles in gene expression repression and defense against exogenous nucleic acids.

Argonaute proteins differ from Cas9 in a number of ways. Unlike Cas9, which exists only in prokaryotes, argonaute proteins are evolutionarily conserved and are present in almost all organisms. Some argonaute proteins have been found to bind single-stranded DNAs and cleave target DNA molecules. Further, no specific consensus secondary structure of guides is required for argonaute binding and no sequence like a CRISPR system PAM site is required. It has been shown that the argonaute protein of Natronobacterium gregoryi can be programmed with single-stranded DNA guides and used for genome editing in mammalian cells (Gao et al., Nature Biotech., May 2, 2016; doi:10.1038/nbt.3547).

Argonaute proteins require a 5′ phosphorylated single-stranded guide DNA molecule that is about 24 nucleotides in length. The amino acid sequence of an argonaute that may be used in the practice of the invention is set out in Table 2.

TABLE 2 Natronobacterium gregoryi Argonaute Amino Acid Sequence (SEQ ID NO: 34)

1 MTVIDLDSTT TADELTSGHT YDISVTLTGV YDNTDEQHPR MSLAFEQDNG ERRYITLWKN
61 TTPKDVFTYD YATGSTYIFT NIDYEVKDGY ENLTATYQTT VENATAQEVG TTDEDETFAG
121 GEPLDHHLDD ALNETPDDAE TESDSGHVMT SFASRDQLPE WTLHTYTLTA TDGAKTDTEY
181 ARRTLAYTVR QELYTDHDAA PVATDGLMLL TPEPLGETPL DLDCGVRVEA DETRTLDYTT
241 AKDRLLAREL VEEGLKRSLW DDYLVRGIDE VLSKEPVLTC DEFDLHERYD LSVEVGHSGR
301 AYLHINFRHR FVPKLTLADI DDDNIYPGLR VKTTYRPRRG HIVWGLRDEC ATDSLNTLGN
361 QSVVAYHRNN QTPINTDLLD AIEAADRRVV ETRRQGHGDD AVSFPQELLA VEPNTHQIKQ
421 FASDGFHQQA RSKTRLSASR CSEKAQAFAE RLDPVRLNGS TVEFSSEFFT GNNEQQLRLL
481 YENGESVLTF RDGARGAHPD ETFSKGIVNP PESFEVAVVL PEQQADTCKA QWDTMADLLN
541 QAGAPPTRSE TVQYDAFSSP ESISLNVAGA IDPSEVDAAF VVLPPDQEGF ADLASPTETY
601 DELKKALANM GIYSQMAYFD RFRDAKIFYT RNVALGLLAA AGGVAFTTEH AMPGDADMFI
661 GIDVSRSYPE DGASGQINIA ATATAVYKDG TILGHSSTRP QLGEKLQSTD VRDIMKNAIL
721 GYQQVTGESP THIVIHRDGF MNEDLDPATE FLNEQGVEYD IVEIRKQPQT RLLAVSDVQY
781 DTPVKSIAAI NQNEPRATVA TFGAPEYLAT RDGGGLPRPI QIERVAGETD IETLTRQVYL
841 LSQSHIQVHN STARLPITTA YADQASTHAT KGYLVQTGAF ESNVGFL

Introduction of Materials into Cells:

The invention also includes compositions and methods for introduction of gene editing system components and/or donor nucleic acid molecules into cells. Introduction of various molecules into cells may be done in a number of ways, including by methods described in many standard laboratory manuals, such as Davis et al., BASIC METHODS IN MOLECULAR BIOLOGY, (1986) and Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), such as calcium phosphate transfection, DEAE-dextran mediated transfection, microinjection, cationic lipid-mediated transfection, electroporation, transduction, scrape loading, ballistic introduction, nucleoporation, hydrodynamic shock, and infection.

The invention includes methods in which different components of nucleic acid cutting entities and/or donor nucleic acid molecules are introduced into cells by different means, as well as compositions of matter for performing such methods. For example, a lentiviral vector may be used to introduce nucleic acid encoding Cas9 operably linked to a suitable promoter and guide RNA may be introduced by transfection. Further, donor nucleic acid may be associated with the guide RNA. Also, Cas9 mRNA may be transcribed from a chromosomally integrated nucleic acid molecule, resulting in either constitutive or regulatable production of this protein.

In many instances, a single type of nucleic acid cutting entity molecule may be introduced into a cell, but some nucleic acid cutting entity molecules may be expressed within the cell. One example of this is where two zinc finger-FokI fusions are used to generate a double-stranded break in intracellular nucleic acid. In some instances, only one of the zinc finger-FokI fusions may be introduced into the cell and the other zinc finger-FokI fusion may be produced intracellularly.

Transfection agents suitable for use with the invention include transfection agents that facilitate the introduction of RNA, DNA and proteins into cells. Exemplary transfection reagents include TurboFect Transfection Reagent (Thermo Fisher Scientific), Pro-Ject Reagent (Thermo Fisher Scientific), TRANSPASS™ P Protein Transfection Reagent (New England Biolabs), CHARIOT™ Protein Delivery Reagent (Active Motif), PROTEOJUICE™ Protein Transfection Reagent (EMD Millipore), 293fectin, LIPOFECTAMINE™ 2000, LIPOFECTAMINE™ 3000 (Thermo Fisher Scientific), LIPOFECTAMINE™ (Thermo Fisher Scientific), LIPOFECTIN™ (Thermo Fisher Scientific), DMRIE-C, CELLFECTIN™ (Thermo Fisher Scientific), OLIGOFECTAMINE™ (Thermo Fisher Scientific), LIPOFECTACE™, FUGENE™ (Roche, Basel, Switzerland), FUGENE™ HD (Roche), TRANSFECTAM™ (Transfectam, Promega, Madison, Wis.), TFx-10™ (Promega), TFx-20™ (Promega), TFx-50™ (Promega), TRANSFECTIN™ (Bio-Rad, Hercules, Calif.), SILENTFECT™ (Bio-Rad), Effectene™ (Qiagen, Valencia, Calif.), DC-chol (Avanti Polar Lipids), GENEPORTER™ (Gene Therapy Systems, San Diego, Calif.), DHARMAFECT 1™ (Dharmacon, Lafayette, Colo.), DHARMAFECT 2™ (Dharmacon), DHARMAFECT 3™ (Dharmacon), DHARMAFECT 4™ (Dharmacon), ESCORT™ III (Sigma, St. Louis, Mo.), and ESCORT™ IV (Sigma Chemical Co.).

The invention further includes methods in which one molecule is introduced into a cell, followed by the introduction of another molecule into the cell. Thus, more than one nucleic acid cutting entity component may be introduced into a cell at the same time or at different times. As an example, the invention includes methods in which Cas9 is introduced into a cell while the cell is in contact with a transfection reagent designed to facilitate the introduction of proteins into cells (e.g., TurboFect Transfection Reagent), followed by washing of the cells and then introduction of guide RNA while the cell is in contact with LIPOFECTAMINE™ 2000.

In some specific instances, Cas9-RNA complexes may be introduced into cells at one time point and donor nucleic acid molecules may be introduced at a different time point. It has been shown that gene editing efficiency increases when two gene editing reagents such as these are introduced into cells at separate time points. Further, Cas9-RNA complexes may be introduced first, followed by donor nucleic acid molecules being introduced later. Also, donor nucleic acid molecules may be introduced first, followed by Cas9-RNA complexes being introduced later. The time between introduction of the different gene editing reagents into cells may be between 1 minute and 600 minutes (e.g., 1 minute and 500 minutes, 1 minute and 400 minutes, 1 minute and 300 minutes, 1 minute and 200 minutes, 1 minute and 100 minutes, 1 minute and 50 minutes, 1 minute and 30 minutes, 1 minute and 20 minutes, 1 minute and 10 minutes, 5 minutes and 500 minutes, 5 minutes and 200 minutes, 5 minutes and 100 minutes, 5 minutes and 50 minutes, 5 minutes and 30 minutes, 10 minutes and 100 minutes, 10 minutes and 200 minutes, 10 minutes and 50 minutes, 15 minutes and 100 minutes, etc.).

Conditions will normally be adjusted on, for example, a per cell type basis for a desired level of nucleic acid cutting entity component introduction into the cells. While enhanced conditions will vary, enhancement can be measured by detection of intracellular nucleic acid cutting activity. Thus, the invention includes compositions and methods for measurement of the intracellular introduction of nucleic acid cutting activity within cells.

With respect to CRISPRs, the invention also includes compositions and methods related to the formation and introduction of CRISPR complexes into cells.

A number of compositions and methods may be used to form CRISPR complexes. For example, Cas9 mRNA and a guide RNA may be encapsulated in INVIVOFECTAMINE™ for later in vivo or in vitro delivery as follows. Cas9 mRNA is mixed (e.g., at a concentration of 0.6 mg/ml) with guide RNA. The resulting mRNA/gRNA solution may be used as is or after addition of a diluent, and is then mixed with an equal volume of INVIVOFECTAMINE™ and incubated at 50° C. for 30 min. The mixture is then dialyzed using a 50 kDa molecular weight cutoff for 2 hours in 1X PBS, pH 7.4. The resulting dialyzed sample containing the formulated mRNA/gRNA is diluted to the desired concentration and applied directly to cells in vitro or injected via the tail vein or intraperitoneally for in vivo delivery. The formulated mRNA/gRNA is stable and can be stored at 4° C.

For Cas9 mRNA transfection of cultured cells, such as 293 cells, 0.5 μg of mRNA was added to 25 μl of Opti-MEM, followed by addition of 50-100 ng of gRNA. Meanwhile, 2 μl of LIPOFECTAMINE™ 3000 or RNAiMax was diluted into 25 μl of Opti-MEM and then mixed with the mRNA/gRNA sample. The mixture was incubated for 15 minutes prior to addition to the cells.

A CRISPR system activity may comprise expression of a reporter (e.g., green fluorescent protein, β-lactamase, luciferase, etc.) or nucleic acid cleavage activity. Using nucleic acid cleavage activity for purposes of illustration, total nucleic acid can be isolated from cells to be tested for CRISPR system activity and then analyzed for the amount of nucleic acid that has been cut at the target locus. If the cell is diploid and both alleles contain target loci, then the data will often reflect two cut sites per cell. CRISPR systems can be designed to cut multiple target sites (e.g., two, three, four, five, etc.) in a haploid target cell genome. Such methods can be used to, in effect, “amplify” the data for enhancement of CRISPR system component introduction into cells (e.g., specific cell types). Conditions may be enhanced such that greater than 50% of the total target loci in cells exposed to CRISPR system components (e.g., one or more of the following: Cas9 protein, Cas9 mRNA, crRNA, tracrRNA, guide RNA, complexed Cas9/guide RNA, etc.) are cleaved. In many instances, conditions may be adjusted so that greater than 60% (e.g., greater than 70%, greater than 80%, greater than 85%, greater than 90%, greater than 95%, from about 50% to about 99%, from about 60% to about 99%, from about 65% to about 99%, from about 70% to about 99%, from about 75% to about 99%, from about 80% to about 99%, from about 85% to about 99%, from about 90% to about 99%, from about 95% to about 99%, etc.) of the total target loci are cleaved.

EXAMPLES

Example 1: Enhanced CRISPR/Cas9-mediated Precise Genome Editing by Improved Design and Delivery of gRNA, Cas9 Nuclease, and Donor DNA

Abstract

While CRISPR-based gene knock out in mammalian cells has proven to be very efficient, precise insertion of genetic elements through homology directed repair (HDR) remains a rate-limiting step to seamless genome editing. Under the conditions described here, we achieved approximately 60% targeted integration efficiency with up to a six-nucleotide insertion in HEK293 cells. Finally, the use of a short double stranded (ds)DNA oligonucleotide with 3′ overhangs allowed integration of a longer FLAG epitope tag along with a restriction site into multiple loci at rates of up to 50%.

These data suggest that after cleavage, the Cas9 complex dissociates from the cleavage site, or is dislodged sufficiently, allowing access to relatively short (˜30 nt) 3′ overhangs on either side of the break with comparable efficiency. This is likely due to 5′ end resection via the DNA repair machinery. This model favors the design of donor DNAs with the insertion or SNP repair element as close to the cleavage site as possible and 3′ protruding single strand homology arms of approximately 30 bases for larger donor molecules. For smaller single stranded donor molecules, 30 base arms 3′ to the insertion/repair cassette and arms of greater than 40 bases on the 5′ end seem to be favored.

Introduction

The recent advances in CRISPR-mediated genome engineering enable researchers to efficiently introduce double-strand breaks (DSBs) in genomic DNA (Cho, S. W., Kim, S., Kim, J. M., Kim, J. S., Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease, Nat. Biotechnol. 31:230-232 (2013); Jiang, W., Bikard, D., Cox, D., Zhang, F., Marraffini, L. A., RNA-guided editing of bacterial genomes using CRISPR-Cas systems, Nat. Biotechnol. 31:233-239 (2013); Liang, X., Potter, J., Kumar, S., Zou, Y., Quintanilla, R., Sridharan, M., Carte, J., Chen, W., Roark, N., Ranganathan, S., Ravinder, N., Chesnut, J. D., Rapid and highly efficient mammalian cell engineering via Cas9 protein transfection, J. Biotechnol. 208:44-53 (2015); Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., DiCarlo, J. E., Norville, J. E., Church, G. M., RNA-guided human genome engineering via Cas9, Science 339(6121):823-826 (2013); Wang, H., Yang, H., Shivalila, C. S., Dawlaty, M. M., Cheng, A. W., Zhang, F., Jaenisch, R., One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering, Cell 153:910-918 (2013)). The DSBs are then mostly repaired by either the non-homologous end joining (NHEJ) pathway or the homology-directed repair (HDR) pathway. In mammalian cells, the NHEJ pathway is predominant and error-prone, which results in disruptive insertions or deletions (indels) at targeted loci, allowing for the efficient creation of gene knockouts. Alternatively, the cells may utilize sister chromatids or an exogenous DNA template to repair the DNA damage via HDR, but the efficiency is relatively low. For example, the use of a Cas9 nickase produced HDR frequencies of 6% in HEK293FT cells with a single-stranded DNA oligonucleotide (ssDNA) (Ran, F. A., Hsu, P. D., Lin, C. Y., Gootenberg, J. S., Konermann, S., Trevino, A. E., Scott, D. A., Inoue, A., Matoba, S., Zhang, Y., Zhang, F., Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity, Cell 154(6):1380-1389 (2013)) or 5% in human embryonic stem cells (hESCs) with a long DNA donor template containing a puromycin selection cassette (Rong, Z., Zhu, S., Xu, Y., Fu, X., Homologous recombination in human embryonic stem cells using CRISPR/Cas9 nickase and a long DNA donor template, Protein Cell 5(4):258-260 (2014)). The synchronization of cells at M phase with nocodazole prior to nucleofection resulted in up to 38% and 1.6% HDR in HEK293T cells and hESCs respectively, which were higher than the controls of 26% and essentially ˜0% in un-synchronized HEK293T cells and hESCs respectively (Lin, S., Staahl, B. T., Alla, R. K., Doudna, J. A., Enhanced homology-directed human genome engineering by controlled timing of CRISPR/Cas9 delivery, Elife 3:e04766 (2014)). The co-delivery of gRNA with a ssDNA donor into Cas9-expressing human pluripotent stem cells (hPSCs) generated homozygous knock-in clones at a rate of up to 10% (González, F., Zhu, Z., Shi, Z. D., Lelli, K., Verma, N., Li, Q. V., Huangfu, D., An iCRISPR platform for rapid, multiplexable, and inducible genome editing in human pluripotent stem cells, Cell Stem Cell 15(2):215-226 (2014)). The delivery of Cas9 ribonucleoproteins (RNPs) into primary T cells via electroporation caused up to 40% of cells to lose high-level cell-surface expression of CXCR4 and generated genomic knock-in modifications with up to 20% efficiency (Schumann, K., Lin, S., Boyer, E., Simeonov, D. R., Subramaniam, M., Gate, R. E., Haliburton, G. E., Ye, C. J., Bluestone, J. A., Doudna, J. A., Marson, A., Generation of knock-in primary human T cells using Cas9 ribonucleoproteins, Proc. Natl. Acad. Sci. USA 112(33):10437-10442 (2015)). Recently, several attempts have been made to improve HDR efficiency by biochemically altering the HDR or NHEJ pathways.
For example, the treatment of cells with Scr7, a DNA ligase IV inhibitor, resulted in up to a 19-fold increase in HDR efficiency (Maruyama, T., Dougan, S. K., Truttmann, M. C., Bilate, A. M., Ingram, J. R., Ploegh, H. L., Increasing the efficiency of precise genome editing with CRISPR-Cas9 by inhibition of nonhomologous end joining, Nat. Biotechnol. 33(5):538-542 (2015)). The simultaneous suppression of both KU70 and DNA ligase IV with siRNAs improved the efficiency of HDR 4-5 fold (Chu, V. T., Weber, T., Wefers, B., Wurst, W., Sander, S., Rajewsky, K., Kühn, R., Nat. Biotechnol. 33(5):543-548 (2015)). The HDR enhancer RS-1 increased the knock-in efficiency in rabbit embryos both in vitro and in vivo by 2-5 fold (Song, J., Yang, D., Xu, J., Zhu, T., Chen, Y. E., Zhang, J., RS-1 enhances CRISPR/Cas9- and TALEN-mediated knock-in efficiency, Nat. Commun. 7:10548, doi: 10.1038/ncomms10548 (2016)). Most recently, the use of asymmetric ssDNA donors of optimal length increased the rate of HDR in human cells up to 60% for a single nucleotide substitution (Richardson, C. D., Ray, G. J., DeWitt, M. A., Curie, G. L., Corn, J. E., Enhancing homology-directed genome editing by catalytically active and inactive CRISPR-Cas9 using asymmetric donor DNA, Nat. Biotechnol. 34:339-344 (2016)). In this study, we examined alternative approaches to improve HDR without impairment of other cellular DNA repair machinery. By optimizing the design and delivery of gRNA, Cas9 nuclease and donor DNA, we achieved approximately 40% precise genome editing efficiencies in multiple genomic loci of various cell lines. The vicinity of the DSB to the target locus, asymmetric sense or antisense ssDNA, and electroporation conditions determined the overall integration efficiency. Furthermore, the alternate design of a short dsDNA oligonucleotide with 3′ overhangs improved the insertion efficiency of epitope tags into the genome.

Materials and Methods

Materials

GENEART™ PLATINUM™ Cas9 Nuclease, GENEART™ CRISPR gRNA Design Tool, GENEART™ Precision gRNA Synthesis Kit, GRIPTITE™ HEK293 cells, DMEM medium, Fetal Bovine Serum (FBS), TRYPLE™ Express Enzyme, JUMP-IN™ GRIPTITE™ HEK293 Kit, Lentivirus expressing Cas9 nuclease and blasticidin marker, 2% E-GEL® EX Agarose Gels, VIRAPOWER® kit, TranscriptAid T7 High Yield Transcription Kit, MEGACLEAR™ Transcription Clean-Up Kit, ZERO BLUNT® TOPO® PCR Cloning Kit, PURELINK® Pro Quick96 Plasmid Purification Kit, QUBIT® RNA BR Assay Kit, NEON® Transfection System 10 μL Kit, and Phusion Flash High-Fidelity PCR Master Mix were from Thermo Fisher Scientific. Monoclonal Cas9 antibody was purchased from Diagenode. The DNA oligonucleotides used for gRNA synthesis or donors were from Thermo Fisher Scientific (Table 4).

Synthesis of gRNA

DNA oligonucleotides used for gRNA synthesis were designed using the GeneArt™ CRISPR gRNA Design Tool. The gRNAs were then synthesized using the GeneArt™ Precision gRNA Synthesis Kit. The concentration of gRNA was determined using the Qubit® RNA BR Assay Kit.

Genomic Cleavage and Detection (GCD) Assay

The genomic cleavage efficiency was measured using the GENEART® Genomic Cleavage Detection kit according to the manufacturer's instructions. The primer sequences for PCR amplification of each genomic locus are described in Table 4. Cells were analyzed at 48 to 72 hours post transfection. The cleavage efficiencies were calculated based on the relative agarose gel band intensities, which were quantified using an ALPHAIMAGER® gel documentation system running ALPHA VIEW®, Version 3.4.0.0 (ProteinSimple, San Jose, Calif., USA).
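The calculation of cleavage efficiency from band intensities can be sketched as follows. The text above states only that relative band intensities were used; the specific formula below (one minus the square root of one minus the fraction cleaved, widely used for mismatch-detection assays to account for heteroduplex formation) is an assumption of this sketch, and the function name and example intensities are hypothetical.

```python
import math


def cleavage_efficiency(parental_intensity, cleaved_intensities):
    """Estimate cleavage efficiency from gel band intensities.

    Assumed formula (not stated in the text):
        fraction_cleaved = cleaved / (cleaved + parental)
        efficiency = 1 - sqrt(1 - fraction_cleaved)
    """
    cleaved = sum(cleaved_intensities)
    total = parental_intensity + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    fraction_cleaved = cleaved / total
    return 1.0 - math.sqrt(1.0 - fraction_cleaved)


# Hypothetical intensities: one parental band and two cleavage products.
example = cleavage_efficiency(64.0, [18.0, 18.0])
```

With these illustrative values, 36% of the signal is in the cleaved bands, which this formula converts to an estimated efficiency of about 20%.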

Generation of Stable Cell Lines

The JUMP-IN™ system was used to prepare a GRIPTITE™ HEK293 stable cell line expressing EmGFP (Thermo Fisher Scientific). The gene sequence of EmGFP is described in Table 5. To create a disrupted EmGFP mutant stable cell line, a gRNA targeting the 5′-ctcgtgaccaccttcacctacgg-3′ sequence in the EmGFP gene was synthesized. The resulting gRNA (300 ng) was incubated with 1.5 μg of GENEART® PLATINUM™ Cas9 nuclease and used to transfect wild type EmGFP cells via electroporation. Single cell clonal isolation was carried out by limiting dilution. The EmGFP loci from non-glowing cells were amplified by PCR using a forward primer 5′-atggtgagcaagggcgaggagctg-3′ and a reverse primer 5′-gtcctccttgaagtcgatgccc-3′, and the resulting PCR products were subjected to TOPO cloning and sequencing. From these clones, a disrupted EmGFP stable cell line containing a deletion of 5′-CACCTT-3′ was identified (Table 5). In the homologous recombination assay, the gain of EmGFP function was determined by flow cytometric analysis.
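The relationship between the 23-nucleotide target sequence, its PAM, and the six-nucleotide deletion can be examined with a short sketch. The target and deletion sequences are taken from the text above; the helper names are hypothetical, and the assumption that S. pyogenes Cas9 cuts three base pairs upstream of an NGG PAM is general background rather than something stated in this passage.

```python
# Target sequence quoted in the text (protospacer followed by PAM).
TARGET = "ctcgtgaccaccttcacctacgg".upper()
DELETION = "CACCTT"  # six-nucleotide deletion reported for the disrupted clone


def split_target(site, pam_len=3):
    """Split a target site into (protospacer, PAM)."""
    return site[:-pam_len], site[-pam_len:]


def is_ngg(pam):
    """Return True if the PAM matches the NGG pattern."""
    return len(pam) == 3 and pam[1:] == "GG"


def expected_cut_index(protospacer, offset=3):
    """0-based index of the expected blunt cut within the protospacer.

    Assumes cutting `offset` base pairs upstream of the PAM.
    """
    return len(protospacer) - offset


def delete_first(seq, motif):
    """Remove the first occurrence of motif; return (new_seq, start_index)."""
    start = seq.find(motif)
    if start < 0:
        raise ValueError("motif not found")
    return seq[:start] + seq[start + len(motif):], start


protospacer, pam = split_target(TARGET)
cut = expected_cut_index(protospacer)
disrupted, deletion_start = delete_first(TARGET, DELETION)
```

Under these assumptions the protospacer is 20 nucleotides long, the PAM is CGG, and the deleted motif lies within the protospacer a few bases upstream of the expected cut position.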

To generate a HEK293FT stable cell line expressing eBFP, an eBFP ORF was synthesized by GENEART® custom DNA synthesis (Thermo Fisher Scientific) and then cloned into the pDONR221 vector. Using GATEWAY® recombination technology (Thermo Fisher Scientific), the eBFP ORF was transferred to the pLenti6.2-DEST Gateway Vector and then verified by sequencing. Lentivirus was generated using the VIRAPOWER® kit as described in the manual. To generate a stable cell line, HEK293FT cells were transduced with 0.1 MOI of Lentivirus expressing eBFP. Three days post transduction, cells were selected on 5 μg/mL blasticidin for 2 weeks. Cells expressing eBFP were then collected and diluted to 0.8 cells/mL in complete medium and plated into 96 well plates for single cell cloning. After 2 weeks, clones were isolated and verified for eBFP expression by flow cytometry. In a reporter assay using a stable eBFP-expressing HEK293 cell line, the substitution of C to T in the eBFP gene converts His67 to Tyr67, generating a GFP variant (Table 5).

Homologous Recombination Assays

To create homologous recombination (HR) assays, a series of gRNAs flanking the insertion site within the EmGFP gene were designed and synthesized (Table 4). Each individual gRNA was combined with GENEART™ PLATINUM™ Cas9 Nuclease to form the Cas9 protein/gRNA ribonucleoprotein complexes (Cas9 RNPs). The Cas9 RNPs were then used to transfect cells via NEON® electroporation. The genomic cleavage efficiency was then evaluated using the GENEART® Genomic Cleavage Detection kit at 48 hours post transfection. The gRNAs with highest editing efficiencies and also in close proximity to the insertion site were selected for the subsequent HR assays. For donor design of a single-stranded oligonucleotide, typically the mutation site was positioned at the center flanked by 30 to 50 nucleotides on each side. For asymmetric donor design, 30 nucleotides were placed on either the left or right arm and then 50 or 67 nucleotides on the right or left arm respectively, in both PAM and non-PAM strands. For design of a dsDNA oligonucleotide with single-stranded overhangs, the insertion element was overlapped in the center flanked by various lengths of ssDNA oligonucleotide on each side. By annealing two single-stranded oligonucleotides at 95° C. for 3 minutes, dsDNA donor molecules with either a 5′ protrusion or a 3′ protrusion were generated. To measure homologous recombination efficiency, the donor DNA was either co-transfected with Cas9 RNPs or delivered sequentially into cells via electroporation. At 48 hours post transfection, the gain of EmGFP function in reporter cell lines was determined by flow cytometric analysis with an Attune® NxT Acoustic Focusing Cytometer (Thermo Fisher Scientific). Alternatively, the genomic loci were PCR-amplified using the corresponding primers and then subjected to GENEART® Genomic Cleavage Detection assay or restriction digestion. The resulting PCR products were also subjected to TOPO cloning. Typically, 96 colonies were randomly picked for sequencing. The sequencing data were analyzed using VECTOR NTI ADVANCE® 11.5 software (Thermo Fisher Scientific).
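The symmetric and asymmetric single-stranded donor layouts described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the coordinate convention, and the placeholder reference sequence are hypothetical, and the sketch does not reproduce the actual donor sequences of Table 4.

```python
def build_ssdna_donor(reference, edit_start, edit_end, replacement,
                      left_arm, right_arm):
    """Assemble a donor: left homology arm + edit + right homology arm.

    reference   -- one strand of the locus, written 5' to 3' (hypothetical input)
    edit_start, edit_end -- 0-based slice of the reference that is replaced;
                   equal values mean a pure insertion at that position
    left_arm, right_arm  -- homology arm lengths in nucleotides
    """
    if not 0 <= edit_start <= edit_end <= len(reference):
        raise ValueError("edit coordinates fall outside the reference")
    if edit_start < left_arm or len(reference) - edit_end < right_arm:
        raise ValueError("reference is too short for the requested arms")
    left = reference[edit_start - left_arm:edit_start]
    right = reference[edit_end:edit_end + right_arm]
    return left + replacement + right


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


# Placeholder 100-nt reference (not a real locus), with a six-nucleotide insert.
reference = "AC" * 50
symmetric = build_ssdna_donor(reference, 50, 50, "CACCTT", 30, 30)
asymmetric = build_ssdna_donor(reference, 50, 50, "CACCTT", 30, 50)
opposite_strand = reverse_complement(asymmetric)
```

Swapping the two arm lengths gives the mirror-image asymmetric design, and taking the reverse complement yields the corresponding donor on the other strand, which covers the PAM-strand and non-PAM-strand variants in this simplified model.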

Electroporation

Typically, 1×10⁵ GRIPTITE™ HEK293 cells were used per electroporation using the Neon® Transfection System 10 μL Kit (Thermo Fisher Scientific). To optimize the electroporation conditions, the preprogrammed NEON® 24-well optimization protocol was tested according to the manufacturer's instructions. To make up a master mix of 24 reactions, 8 μl of 3 mg/ml GENEART™ PLATINUM™ Cas9 Nuclease was added to 240 μl of Resuspension Buffer R provided in the kit, followed by addition of 4.8 μg of gRNA. Upon mixing, the sample was incubated at room temperature for 10 minutes to form Cas9 RNP complexes. Meanwhile, 2.4×10⁶ cells were transferred to a sterile Eppendorf tube and centrifuged at 1000× g for 5 minutes. The supernatant was carefully aspirated and the cell pellet was washed once with 1 ml of DPBS without Ca²⁺ and Mg²⁺. Upon centrifugation, the supernatant was carefully aspirated. Resuspension Buffer R containing the Cas9 RNPs was used to resuspend the cell pellets. A 10 μl cell suspension was used for each of the preprogrammed NEON® 24-well optimization protocols. The electroporated cells were transferred to 24 or 48-well plates containing 0.5 ml of the corresponding growth medium and then incubated for 48 hours in a 5% CO2 incubator. The cells were washed with DPBS and then lysed in lysis buffer, followed by the genomic cleavage and detection assay as described above. Upon optimization of electroporation conditions, higher doses of Cas9 protein (1.5 to 2 μg) and gRNA (300 to 500 ng) were used to improve the genome editing efficiency.
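The per-reaction amounts implied by the 24-reaction master mix above can be worked out with a small calculation. The input quantities are those stated in the text; the function name and dictionary keys are hypothetical, and the sketch ignores pipetting overage and the small volume contributed by the Cas9 and gRNA stocks.

```python
def per_reaction_amounts(cas9_volume_ul=8.0, cas9_conc_mg_per_ml=3.0,
                         buffer_volume_ul=240.0, grna_ug=4.8,
                         total_cells=2.4e6, reactions=24):
    """Divide the master-mix quantities evenly across the reactions."""
    # 1 mg/ml equals 1 ug/ul, so volume (ul) x concentration gives ug.
    cas9_ug_total = cas9_volume_ul * cas9_conc_mg_per_ml
    return {
        "cas9_ug": cas9_ug_total / reactions,
        "grna_ng": grna_ug * 1000.0 / reactions,
        "buffer_ul": buffer_volume_ul / reactions,
        "cells": total_cells / reactions,
    }


amounts = per_reaction_amounts()
```

Under these figures each 10 μl reaction receives about 1 μg of Cas9 protein, 200 ng of gRNA, and 1×10⁵ cells, consistent with the cell number stated at the start of the paragraph and below the higher doses (1.5 to 2 μg Cas9, 300 to 500 ng gRNA) used after optimization.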

For each homologous recombination assay, 1.5 μg of Cas9 protein and 360 ng of gRNA were added to Resuspension Buffer R to a final volume of 7 μl, but limiting the total volume of Cas9 protein plus gRNA to less than 1 μl. The gRNA could be diluted in Buffer R if the concentration was too high. Upon mixing, the sample was incubated at room temperature for 5 to 10 minutes to form Cas9 RNPs. Meanwhile, GRIPTITE™ HEK293 cells expressing either eBFP or disrupted EmGFP were detached from the culture flask with TRYPLE™ Express Enzyme and then counted. Aliquots of 1×10⁶ cells were washed once with DPBS without Ca²⁺ and Mg²⁺ and the cell pellets were resuspended in 50 μl of Resuspension Buffer R. A 5 μl aliquot of cell suspension was mixed with 7 μl of Cas9 RNPs. For sequential delivery of Cas9 RNPs and DNA donor, 10 μl of cell suspension containing Cas9 RNP was subjected to electroporation with the voltage set at 1150V, the pulse width set at 20 ms, and the number of pulses set at 2. The electroporated cells were transferred to 300 μl of Resuspension Buffer R or DPBS. Upon centrifugation at 2000× g for 5 minutes, the supernatant was carefully aspirated and the cell pellet was resuspended in Buffer R to a final volume of 11 μl, followed by addition of 1 μl of 10 pmol/μl or 0.3 μg/μl ssDNA donor. Alternatively, 1 μl of 10 pmol/μl short dsDNA donor with and without single-stranded overhangs was added. An aliquot of 10 μl cell suspension containing donor DNA was used for electroporation using the same instrument settings. Upon electroporation, the cells were transferred to a 48-well plate containing 0.5 ml culture media. For sequential delivery, the viability of HEK293 cells was around 50%. For co-transfection of Cas9 RNPs with donor DNA, 0.5 μl of 20 pmol/μl or 0.6 μg/μl ssDNA donor was directly added to the 12 μl of cell suspension containing Cas9 RNPs. Alternatively, 0.5 μl of 20 pmol/μl short dsDNA donor with and without single-stranded overhangs was added. An aliquot of 10 μl of cell suspension containing Cas9 RNPs and donor DNA was used for electroporation. Samples without either gRNA or donor DNA served as controls. In addition to ssDNA donor, a 400 bp double-stranded DNA fragment was also tested, which was amplified from the wild type EmGFP gene using a pair of forward 5′-atggtgagcaagggcgaggagctg-3′ (SEQ ID NO: 35) and reverse 5′-gtcctccttgaagtcgatgccc-3′ (SEQ ID NO: 36) primers. For each assay, 300 to 500 ng dsDNA was used. At 48 hours post transfection, the cells were analyzed by flow cytometry. Alternatively, the genomic loci were PCR-amplified with the corresponding primers. The resulting PCR fragments were analyzed using the GENEART® Genomic Cleavage Detection assay or restriction digestion. The PCR fragments were also subjected to cloning and sequencing.
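The donor amounts given above for sequential delivery and co-transfection can be compared with a brief calculation. The volumes and concentrations come from the text; the average mass of about 330 g/mol per nucleotide for single-stranded DNA is a common rule-of-thumb approximation assumed here, the 97-nucleotide length is that of the ssDNA donor described elsewhere in this Example, and the function names are hypothetical.

```python
def donor_pmol(volume_ul, conc_pmol_per_ul):
    """Amount of donor added, in pmol."""
    return volume_ul * conc_pmol_per_ul


def ssdna_mass_ug(pmol, length_nt, g_per_mol_per_nt=330.0):
    """Approximate ssDNA mass in micrograms.

    Uses an assumed average of ~330 g/mol per nucleotide; the exact value
    depends on base composition and terminal groups.
    """
    mol = pmol * 1e-12
    grams = mol * length_nt * g_per_mol_per_nt
    return grams * 1e6


sequential = donor_pmol(1.0, 10.0)       # 1 ul of 10 pmol/ul
cotransfection = donor_pmol(0.5, 20.0)   # 0.5 ul of 20 pmol/ul
mass_97mer = ssdna_mass_ug(sequential, 97)
```

Both delivery schemes therefore add the same 10 pmol of donor, and for a 97-nucleotide oligonucleotide this corresponds to roughly 0.3 μg under the stated approximation, in line with the mass-based concentrations quoted alongside the molar ones.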

Optimization of Delivery of Cas9 RNP and Donor DNA

To measure HDR efficiency, we engineered a GRIPTITE™ HEK293 stable cell line and a HEK293FT stable cell line expressing EmGFP and eBFP, respectively (FIGS. 13A-13B). Cas9 protein/gRNA complexes (Cas9 RNPs) were subsequently used to target the fluorogenic region of EmGFP to generate a disrupted EmGFP stable cell line containing a deletion of six nucleotides (FIG. 13A). The deletion of the Thr63 and Phe64 residues resulted in ablation of EmGFP activity, which could be restored by introducing an exogenous wild type donor DNA molecule. When the disrupted EmGFP stable cells were transfected with Cas9 RNP and ssDNA donor, a significant number of EmGFP-positive cells were observed. Conversely, in the absence of ssDNA donor (Cas9 RNP alone) or gRNA (Cas9/donor), almost no EmGFP-expressing cells were detected. For measurement of homologous recombination activity using the eBFP-expressing HEK293 stable cells, a single nucleotide transition of “C” to “T” converts a His to a Tyr at residue 66, resulting in the conversion of eBFP into the closely related GFP (FIG. 13B). When eBFP-expressing cells were transfected with Cas9 RNP plus ssDNA donor, a significant number of GFP-positive cells were detected. As expected, predominantly eBFP-positive cells, but very few GFP-positive cells, were detected in the absence of gRNA. A time-lapse video for HDR was recorded every 2 hours for a total of 72 hours (data not shown).

After validating our HDR assay systems, we optimized the delivery of Cas9 RNP and donor DNA as described. As shown in Table 10, a majority of NEON® optimization programs worked well for delivery of Cas9 RNP into HEK293 cells. A program with the voltage set at 1150 V, the pulse width set at 20 ms, and 2 pulses was used for the subsequent study. Initially, we co-delivered Cas9 RNP with a 97-base single-stranded PAM or non-PAM oligonucleotide into HEK293 cells. The PAM ssDNA oligonucleotide donor was defined as the strand containing the PAM (NGG) sequence (FIG. 7A). Based on flow cytometric analysis, we observed approximately 5% and 6% EmGFP-positive cells using the PAM or non-PAM ssDNA oligonucleotides, respectively (FIG. 7B). Since the program used for Cas9 RNP delivery might not apply to the delivery of donor DNA, we tested the sequential delivery of Cas9 RNP and donor DNA. The Cas9 RNP was first delivered into HEK293 cells via electroporation. The electroporated cells were then washed once with Resuspension Buffer R. The cell pellets were resuspended in Buffer R containing ssDNA or dsDNA donor. The cell suspension was then electroporated using the NEON® 24-well optimization protocol (see Table 3).

TABLE 3 Electroporation Protocols

Protocol            Pstd   2      3      4      5      6      7      8
Pulse Voltage (V)   1150   1400   1500   1600   1700   1100   1200   1300
Pulse Width (ms)    20     20     20     20     20     30     30     30
# of Pulses         2      1      1      1      1      1      1      1

Protocol            9      10     11     12     13     14     15     16
Pulse Voltage (V)   1400   1000   1100   1200   1100   1200   1300   1400
Pulse Width (ms)    30     40     40     40     20     20     20     20
# of Pulses         1      1      1      1      2      2      2      2

Protocol            17     18     19     20     21     22     23     24
Pulse Voltage (V)   850    950    1050   1150   1300   1400   1500   1600
Pulse Width (ms)    30     30     30     30     10     10     10     10
# of Pulses         2      2      2      2      3      3      3      3
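For record keeping or scripted analysis, the 24 settings of Table 3 can be held in a small lookup structure. The sketch below is illustrative only: the names `Pulse` and `PROTOCOLS` are hypothetical, protocol 1 is taken to be the standard program, and the units (volts and milliseconds) are inferred from the settings quoted in the text (1150 V, 20 ms, 2 pulses).

```python
from typing import Dict, NamedTuple


class Pulse(NamedTuple):
    voltage_v: int   # pulse voltage, assumed to be in volts
    width_ms: int    # pulse width, assumed to be in milliseconds
    pulses: int      # number of pulses


# Values transcribed from Table 3, protocols 1 (standard) through 24.
_VOLTAGES = [1150, 1400, 1500, 1600, 1700, 1100, 1200, 1300,
             1400, 1000, 1100, 1200, 1100, 1200, 1300, 1400,
             850, 950, 1050, 1150, 1300, 1400, 1500, 1600]
_WIDTHS = [20] * 5 + [30] * 4 + [40] * 3 + [20] * 4 + [30] * 4 + [10] * 4
_COUNTS = [2] + [1] * 11 + [2] * 8 + [3] * 4

PROTOCOLS: Dict[int, Pulse] = {
    number: Pulse(v, w, n)
    for number, (v, w, n) in enumerate(zip(_VOLTAGES, _WIDTHS, _COUNTS), start=1)
}


def protocols_with_width(width_ms: int) -> list:
    """Return the protocol numbers that use the given pulse width."""
    return [num for num, p in PROTOCOLS.items() if p.width_ms == width_ms]


if __name__ == "__main__":
    print(PROTOCOLS[1])
    print(protocols_with_width(10))
```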

TABLE 4 DNA Oligonucleotides

Oligonucleotides for six nucleotide insertion in disrupted EmGFP stable cell line (FIGS. 7, 8, and 9)

Name                SEQ ID  Sequence
Non-PAM oligo       35      p-OftctgcaccaccggcaagctgcccgtgccctggcccaccctcgtgaccaccttcacctacggcgtgcagtgcttcgcccgctaccccgaccacFZg
PAM oligo (97-mer)  36      p-OFTGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGCTTGCCGGTGGTGCAGFZG
97                  37      caTGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGCTTGCCGGTGGTGCAGatG
PS79                38      p-OFTGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCFEC
79                  39      CATGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGC
PS60                40      p-OEGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCFEG
60                  41      CGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGG
PS40                42      p-EECGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGFEG
40                  43      GGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGG

Oligonucleotides for single point mutation in HEK293 cells expressing BFP (FIGS. 7D and 8B)

Name                SEQ ID  Sequence
100                 44      ACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACT
PS100               45      P-FOCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGFOT
90                  46      GCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGC
PS90                47      P-EOAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCFEC
80                  48      GCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGA
PS80                49      P-EOTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACAZEA
70                  50      CCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACC
PS70                51      P-OOCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGFOC
60                  52      GCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCC
PS60                53      P-EOCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTAOOC
50                  54      CCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTA
PS50                55      P-OOCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGOZA

Oligonucleotides for single point mutation in iPSCs

Name                SEQ ID  Sequence
USP12               56      GCGTATCAATGGATACTTACATTGACTAATCCAAACTAGTGCTCATTGACCGGAAACTGTTCTGGACCAATCTCTTTCTC
MDC1                57      GCAATCCAGTGAATCCTTGAGGTGTAACGTGGAGCCAGTAAGGCGGCTACATATCTTTAGTGGTGCCCATGGACCAGAAAAAG
CREBBP              58      CTGCACGGGGCTGCTGGCGCTCACATTTCCTATTCCTGGATTGATACTAGAGCCGCTGCCTCCTCGTAGAAGCTCCGACAG
HPRT                59      p-OZGACCTTGCCTTCATGTGATTCAGCCCCAGTCCATTACCCTGTGTAGGACTGAGAAATGCAAGACTCTGGCTAGAGTTCCTTCTTCCATCTCCCZZC
HPRT (PAM mutation) 60      p-OZGACCTTGCCTTCATGTGATTCAGCCCCAGTCCATTACaCTGTGTAGGACTGAGAAATGCAAGACTCTGGCTAGAGTTCCTTCTTCCATCTCCCZZC

Asymmetric ssDNA donors (sense top strand, antisense bottom strand)

Name                SEQ ID  Sequence
GFP_PAM67-30        61      p-OzgaagttcatctgcaccaccggcaagctgcccgtgccctggcccaccctcgtgaccaccttcacctacggcgtgcagtgcttcgcccgctaccOOg
GFP_non-PAM67-30    62      p-OEGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGCTTGCCGGTGGTGCAGATGAACTTOFG
GFP_PAM30-67        63      p-OogtgccctggcccaccctcgtgaccaccttcacctacggcgtgcagtgcttcgcccgctaccccgaccacatgaagcagcacgacttcttcaaEZc
GFP_nonPAM30-67     64      p-EFCTTGAAGAAGTCGTGCTGCTTCATGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCAOEG
GFP_PAM51-30        65      p-OoaccggcaagctgcccgtgccctggcccaccctcgtgaccaccttcacctacggcgtgcagtgcttcgcccgctaccOOg
GFP_non-PAM51-30    66      p-OEGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGCTTGCCGGZEG
GFP_PAM30-51        67      p-OogtgccctggcccaccctcgtgaccaccttcacctacggcgtgcagtgcttcgcccgctaccccgaccacatgaagcFEc
GFP_non-PAM30-51    68      p-EOTGCTTCATGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCAOEG
GFP_PAM48-50        69      p-OoaccggcaagctgcccgtgccctggcccaccctcgtgaccaccttcacctacggcgtgcagtgcttcgcccgctaccccgaccacatgaagcagOFc
GFP_non-PAM48-50    70      p-EZGCTGCTTCATGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGCACGAGGGTGGGCCAGGGCACGGGCAGCTTGCCGGZEG
GFP_PAM40-40        71      p-OfagctgcccgtgccctggcccaccctcgtgaccaccttcacctacggcgtgcagtgcttcgcccgctaccccgaccacFZg
GFP_non-PAM40-40    72      p-OFTGTGGTCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGGTGAAGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGCZZG
Note                73      The PAM ssDNA oligonucleotide is defined as the gRNA targeting strand containing the NGG PAM sequence.

Oligonucleotides for insertion of Flag epitope tag plus EcoRI site into BFP locus

Name                SEQ ID  Sequence
BFPinsO1 (ssDNA)    74      p-EZgccctggcccaccctcgtgaccaccctgaccGACTACAAAGACGATGACGACAAGAATTCTtacggcgtgcagtgcttcagccgctaccccgacOFc
BFPinsBF (blunt)    75      p-EFCTACAAAGACGATGACGACAAGAATTOZt
BFPinsBR (blunt)    76      p-FFGAATTCTTGTCGTCATCGTCTTTGTAEZC
BFP6nt5OF           77      p-OZgaccGACTACAAAGACGATGACGACAAGAATTOZt
BFP6nt5OR           78      p-OECCGTAAGAATTCTTGTCGTCATCGTCTTTGTAEZC
BFP6nt3OF           79      p-EFCTACAAAGACGATGACGACAAGAATTCTtacgEOg
BFP6nt3OR           80      p-FFGAATTCTTGTCGTCATCGTCTTTGTAGTCGGTOFG
BFP15nt5OF          81      p-EZgaccaccctgaccGACTACAAAGACGATGACGACAAGAATTOZt
BFP15nt5OR          82      p-EOACTGCACGCCGTAAGAATTCTTGTCGTCATCGTCTTTGTAEZC
BFP15nt3OF          83      p-EFCTACAAAGACGATGACGACAAGAATTCTtacggcgtgcagZEc
BFP15nt3OR          84      p-FFGAATTCTTGTCGTCATCGTCTTTGTAGTCGGTCAGGGTGGTOFC
BFP30nt5OF          85      p-EZgccctggcccaccctcgtgaccaccctgaccGACTACAAAGACGATGACGACAAGAATTOZt
BFP30nt5OR          86      p-EZGGTCGGGGTAGCGGCTGAAGCACTGCACGCCGTAAGAATTCTTGTCGTCATCGTCTTTGTAEZC
BFP30nt3OF          87      p-EFCTACAAAGACGATGACGACAAGAATTCTtacggcgtgcagtgcttcagccgctaccccgacOFc
BFP30nt3OR          88      p-FFGAATTCTTGTCGTCATCGTCTTTGTAGTCGGTCAGGGTGGTCACGAGGGTGGGCCAGGGOFC
BFP24nt3OF          89      p-EFCTACAAAGACGATGACGACAAGAATTCTtacggcgtgcagtgcttcagccgctacOOc
BFP24nt3OR          90      p-FFGAATTCTTGTCGTCATCGTCTTTGTAGTCGGTCAGGGTGGTCACGAGGGTGGGOOA
BFP30nt3OFn         91      p-GACTACAAAGACGATGACGACAAGAATTCTTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCAC
BFP30nt3ORn         92      p-AAGAATTCTTGTCGTCATCGTCTTTGTAGTCGGTCAGGGTGGTCACGAGGGTGGGCCAGGGCAC
BFP36nt3OR          93      P-FFGAATTCTTGTCGTCATCGTCTTTGTAGTCGGTCAGGGTGGTCACGAGGGTGGGCCAGGGCACEEG
BFP36nt3OF          94      p-efCTACAAAGACGATGACGACAAGAATTCTtacggcgtgcagtgcttcagccgctaccccgacofc
BFP45nt3OF          95      p-EFCTACAAAGACGATGACGACAAGAATTCTTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACatgaagcagOFc
BFP45nt3OR          96      p-FFGAATTCTTGTCGTCATCGTCTTTGTAGTCGGTCAGGGTGGTCACGAGGGTGGGCCAGGGCACGGGCAGCTTGCOEG

Oligonucleotides for insertion of Flag epitope tag plus EcoRI site into +5 position in EmGFP gene

Name                SEQ ID  Sequence
uGFPssT6_Flag       97      p-EOccgtgccctggcccaccctcgtgaccacctGACTACAAAGACGATGACGACAAGAATTCTacggcgtgcagtgcttcgcccgctaccccEFc
uGFP32nt5OT6f       98      p-EOccgtgccctggcccaccctcgtgaccacctGACTACAAAGACGATGACGACAAGAATTOZa
uGFP32nt5OT6r       99      p-EZCGGGGTAGCGGGCGAAGCACTGCACGCCGTAGAATTCTTGTCGTCATCGTCTTTGTAEZC
uGFP32nt3OT6f       100     p-FEAATTCTTGTCGTCATCGTCTTTGTAGTCAGGTGGTCACGAGGGTGGGCCAGGGCACGEEC
uGFP32nt3OT6r       101     p-EFCTACAAAGACGATGACGACAAGAATTCTacggcgtgcagtgcttcgcccgctaccccEFc

Note: Sense strand is defined as the gRNA targeting strand with NGG PAM site.
p: 5′-phosphate; F: Phosphorothioate-A; O: Phosphorothioate-C; E: Phosphorothioate-G; Z: Phosphorothioate-T

Primers for amplification of genomic loci

Locus               SEQ ID  gRNA Target
BFP                 102     CTCGTGACCACCCTGACCCACGG
GFP (−39)           103     CTGAAGTTCATCTGCACCACCGG
GFP (−34)           104     GGGCACGGGCAGCTTGCCGGTGG
GFP (−31)           105     CCAGGGCACGGGCAGCTTGCCGG
GFP (−20)           106     GGCAAGCTGCCCGTGCCCTGG
GFP (−19)           107     CACGAGGGTGGGCCAGGGCACGG
GFP (−14)           108     GGTGGTCACGAGGGTGGGCCAGG
GFP (−7)            109     GCCGTAGGTGGTCACGAGGGTGG
GFP (−3)            110     GCACGCCGTAGGTGGTCACGAGG
GFP (+3)            111     CCCACCCTCGTGACCACCTACGG
GFP (+5)            112     GAAGCACTGCACGCCGTAGGTGG
GFP (+8)            113     GGCGAAGCACTGCACGCCGTAGG
GFP (+21)           114     CTTCATGTGGTCGGGGTAGCGGG
GFP (+30)           115     GTCGTGCTGCTTCATGTGGTCGG
GFP (+34)           116     AGAAGTCGTGCTGCTTCATGTGG
HPRT                117     GCATTTCTCAGTCCTAAACAGGG
CREBBP              118     AGCGGCTCTAGTATCAACCC
MDC1                119     AAGATATGTAGCCGCCCTAC
USP12               120     CCGGTCAATGAGCACTATTT

TABLE 5 EmGFP and BFP sequences.

(A) Wild type EmGFP sequence (SEQ ID No: 121)

(B) A disrupted EmGFP sequence (SEQ ID No: 122)

Note: In a disrupted EmGFP sequence, a sequence of “cacctt” was deleted. Underlined sequences were used for amplification of EmGFP locus and for preparation of dsDNA donor.

(C) Wild type BFP sequence (SEQ ID No: 123)

(D) Conversion of BFP sequence to GFP sequence by one mutation (SEQ ID No: 124)

TABLE 6 Effect of sequential delivery of Cas9 RNP and PAM or non-PAM ssDNA donor on HDR (FIG. 7B data).

Donor (% GFP+ cells)         RD     RDx2   R−>D   D−>R
PAM ssDNA donor       Avg    4.9    6.2    10.9   12.3
                      Std    0.2    0.0    0.5    1.7
non-PAM ssDNA donor   Avg    6.1    8.9    14.0   16.0
                      Std    1.7    0.9    1.1    0.8

TABLE 7 Effect of dose on HDR (FIG. 7C data).

                     neg    (+)gRNA  (−)gRNA  0.05 μg  0.1 μg  0.2 μg  0.5 μg  1 μg
Avg (% GFP+ cells)   0.00   0.04     0.01     1.4      6.5     12.0    12.7    7.2
Std                  0.00   0.02     0.00     0.2      1.5     1.1     0.4     0.4

TABLE 8 Effect of sequential delivery of Cas9 RNP and dsDNA donor on HDR (FIG. 7D data).

                     neg    (+)gRNA  (−)gRNA  RD     R−>D
Avg (% GFP+ cells)   0.05   0.18     0.09     4.25   10.36
Std                  0.01   0.04     0.01     0.35   0.30

TABLE 9 Effect of sequential delivery on HDR using an alternative reporter gene (FIG. 7E data).

                     neg    (+)gRNA  (−)gRNA  RD      R−>D    D−>R
Avg (% GFP+ cells)   0.14   0.13     0.02     22.75   32.85   31.55
Std                  0.01   0.04     0.00     0.78    1.06    0.78

TABLE 10 Optimization of delivery of Cas9 RNP.

neg   Pstd  P2    P3    P4    P5    P6    P7    P8    P9    P10   P11
0     78    72    75    79    80    63    69    74    76    59    69

P12   P13   P14   P15   P16   P17   P18   P19   P20   P21   P22   P23   P24
71    69    74    76    78    40    54    74    76    74    77    80    77

TABLE 11 Optimization of sequential delivery of Cas9 RNPs and ss oligonucleotide.

neg     Cas9 RNP  Cas9/D  Pstd    Pstd    Pstd    P2
0.08    0.38      0.12    17.86   15.33   13.04   14.32

P3      P4      P5      P6      P7      P8      P9
10.48   6.03    1.20    13.25   13.62   12.14   7.84

P10     P11     P12     P13     P14     P15     P16
12.22   13.25   9.82    12.71   11.77   7.09    0.39

P17     P18     P19     P20     P21     P22     P23     P24
10.41   11.72   4.62    1.08    9.98    1.99    0.46    0.01

TABLE 12 Optimization of delivery of Cas9 RNP and dsDNA.

neg     Cas9 RNP  Cas9/D  Pstd    P2      P3      P4      P5      P6
0.045   0.13      0.011   9.26    8.468   2.8     0.58    0.178   10.6

P7      P8      P9      P10     P11     P12     P13     P14     P15
9.868   3.857   0.725   10.15   9.489   2.8     9.03    7.976   1.868

P16     P17     P18     P19     P20     P21     P22     P23     P24
0.574   4.89    7.76    9.22    2.5     6.1     3.54    0.86    0

Results

Effect of Sequential Delivery of Nuclease and Donor DNA

The sequential delivery of Cas9 RNP followed by donor DNA resulted in more than a two-fold increase in EmGFP-positive cells regardless of the use of ssDNA or dsDNA donor (FIG. 7B and FIG. 7D). The reverse sequential delivery of ssDNA donor first and then Cas9 RNP exhibited a similar effect (FIG. 7B). However, two consecutive electroporations without the intermediate wash step showed only mild improvement over the co-delivery of Cas9 RNP and ssDNA donor. The use of the non-PAM strand donor exhibited slightly higher HDR efficiency than the PAM strand donor (FIG. 7B). The effect of sequential delivery was also observed in another reporter cell line system in which eBFP was converted to GFP by a single nucleotide substitution (FIG. 7E). The dosage titrations of ssDNA donor indicated that the optimal amount of ssDNA oligonucleotide was 0.2 to 0.5 μg per 10 μl reaction, which represented approximately 10 pmol of ssDNA oligonucleotide in a 10 μl reaction (FIG. 7C). Upon optimization, we observed approximately 15% EmGFP-positive cells using the non-PAM ssDNA oligonucleotide (FIG. 7C and Table 11). The HEK293 cell viability for co-delivery of Cas9 RNPs and donor DNA was approximately 85%, whereas the cell viability for sequential delivery of Cas9 RNPs and donor decreased to approximately 50%. The optimal electroporation condition was highly dependent on cell type and should be determined experimentally.
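The "more than two-fold" statement can be reproduced directly from the tabulated averages. The short sketch below divides the sequential-delivery values (R−>D) by the co-delivery values (RD) from Tables 6 and 8; the dictionary and function names are illustrative and not part of the disclosed methods.

```python
# Average % GFP+ cells for co-delivery (RD) and sequential delivery (R->D),
# transcribed from Table 6 (ssDNA donors) and Table 8 (dsDNA donor).
CO_DELIVERY = {"PAM ssDNA": 4.9, "non-PAM ssDNA": 6.1, "dsDNA": 4.25}
SEQUENTIAL = {"PAM ssDNA": 10.9, "non-PAM ssDNA": 14.0, "dsDNA": 10.36}


def fold_change(sequential_pct: float, co_delivery_pct: float) -> float:
    """Ratio of sequential-delivery HDR to co-delivery HDR."""
    if co_delivery_pct <= 0:
        raise ValueError("co-delivery percentage must be positive")
    return sequential_pct / co_delivery_pct


FOLD = {
    donor: round(fold_change(SEQUENTIAL[donor], CO_DELIVERY[donor]), 2)
    for donor in CO_DELIVERY
}

if __name__ == "__main__":
    for donor, ratio in FOLD.items():
        print(f"{donor}: {ratio}-fold")
```

With these inputs the ratios are about 2.2 (PAM ssDNA), 2.3 (non-PAM ssDNA), and 2.4 (dsDNA).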

Effects of Oligonucleotide Length and Modification on HDR

It has been reported that relatively short single-stranded oligonucleotides containing 25-61 bases homologous to the target sequence were capable of correcting a single point mutation (Igoucheva, O., Alexeev, V., Yoon, K., Targeted gene correction by small single-stranded oligonucleotides in mammalian cells, Gene Ther. 8(5):391-399 (2001)). The use of phosphorothioate modification of nucleotides has also been shown to prevent degradation of oligonucleotide therapeutic agents in serum and cells (Brown, D. A., Kang, S. H., Gryaznov, S. M., DeDionisio, L., Heidenreich, O., Sullivan, S., Xu, X., Nerenberg, M. I., Effect of phosphorothioate modification of oligodeoxynucleotides on specific protein binding, J. Biol. Chem. 269(43):26801-26805 (1994)). Here we examined the effect of oligonucleotide length and modification on HDR efficiency in our system. The oligonucleotides were chemically synthesized and PAGE-purified with and without phosphorothioate modification at both the 5′ and 3′ ends and phosphate modification at the 5′ end, with a total length that varied from 40 to 100 bases. The desired mutation was positioned at the center of the oligonucleotide. As shown in FIG. 8A and FIG. 8B, the optimal length of ssDNA oligonucleotide was approximately 80 bases, harboring 36-40 bases of homology arm on each side. Oligonucleotides shorter than 60 bases reduced the HDR efficiency significantly, whereas the 100-base oligonucleotides also showed a slightly decreased efficiency. The phosphorothioate modification improved the efficiency of a 6-base insertion, although it had only a mild effect on a one-base substitution. Using an 80-base modified oligonucleotide, we observed approximately 45% GFP-positive cells while introducing a single nucleotide substitution. To confirm the HDR result obtained from flow cytometry, the PCR fragments were cloned and 96 clones were randomly picked for sequencing. As shown in FIG. 8C and FIG. 8D, approximately 6% of the colonies contained the wild type sequence, indicating that the overall genome modification efficiency was nearly 94%. Within this edited fraction, clones harboring insertions and deletions accounted for approximately 54% of all clones sequenced (Table 15). In a few cases, a duplication of an 18-base sequence was inserted by an unknown mechanism (FIG. 8D, Clone No. 2). Approximately 40% of the clones contained the correct point mutation, which was in agreement with the GFP reporter assay.
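The placement of the desired mutation at the center of an oligonucleotide of fixed total length can be illustrated with a short helper. This is a sketch under stated assumptions: the function name and the toy reference sequence are hypothetical, the reference is treated as a plain string on one strand, and chemical modifications (5′ phosphate, phosphorothioate linkages) are not modeled.

```python
def centered_donor(reference: str, edit_start: int, edit_end: int,
                   replacement: str, total_len: int) -> str:
    """Build a donor of total_len bases with the edit placed at its center.

    reference[edit_start:edit_end] is replaced by `replacement`; use
    edit_start == edit_end for a pure insertion. Homology arms are split
    as evenly as possible on both sides of the edit.
    """
    arm_total = total_len - len(replacement)
    if arm_total < 2:
        raise ValueError("total_len leaves no room for homology arms")
    left = arm_total // 2
    right = arm_total - left
    if edit_start - left < 0 or edit_end + right > len(reference):
        raise ValueError("reference too short for the requested arms")
    return (reference[edit_start - left:edit_start]
            + replacement
            + reference[edit_end:edit_end + right])


if __name__ == "__main__":
    toy_reference = "ACGT" * 50  # hypothetical 200-base reference
    # 80-base donor for a one-base substitution: arms of 39 and 40 bases.
    print(centered_donor(toy_reference, 100, 101, "T", 80))
    # 80-base donor for a six-base insertion: arms of 37 bases each.
    print(centered_donor(toy_reference, 100, 100, "CACCTT", 80))
```

For an 80-base donor this yields arms of 37 to 40 bases, in line with the 36-40 base homology arms described above.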

TABLE 13A Effect of oligonucleotide length and modification on HDR (FIG. 8A data).

                                       neg    (+)gRNA  (−)gRNA  40     PS40   60     PS60
Equal Mass      Avg (% GFP+ cells)     0.04   0.26     0.09     2.96   4.35   6.62   9.59
                Std                    0.01   0.23     0.07     0.44   0.71   1.92   0.88
Equal Molarity  Avg (% GFP+ cells)     0.04   0.26     0.09     1.38   1.79   6.63   9.28
                Std                    0.01   0.23     0.07     0.08   0.57   0.04   0.40

TABLE 13B Effect of oligonucleotide length and modification on HDR (FIG. 8A data).

                                       79     PS79    97     PS97
Equal Mass      Avg (% GFP+ cells)     7.99   12.94   7.53   10.98
                Std                    0.45   1.08    1.10   0.16
Equal Molarity  Avg (% GFP+ cells)     7.62   12.53   7.31   11.01
                Std                    0.31   0.40    0.30   1.41

TABLE 14 Effect of oligonucleotide length and modification on HDR using an alternative reporter gene (FIG. 8B data).

                     50     PS50   60     PS60   70     PS70   80     PS80   90     PS90   100    PS100
Avg (% GFP+ cells)   14.4   26.4   28.4   34.4   34.7   39.2   38.4   44.3   32.8   38.8   28.8   38.4
Std                  1.4    1.1    0.3    3.4    2.9    3.4    2.3    0.9    3.4    5.1    2.3    2.3

TABLE 15 Sequencing verification (FIG. 8C data).

                      wt     NHEJ   HDR
Relative percentage   5.8    53.8   40.4

Double Strand Breaks in the Immediate Vicinity of the Altered Locus Facilitate HDR

In the design of gRNAs for homologous recombination, it was previously recommended to introduce the cleavage site in close proximity to the altered locus (Inui, M., Miyado, M., Igarashi, M., Tamano, M., Kubo, A., Yamashita, S., Asahara, H., Fukami, M., Takada, S., Rapid generation of mouse models with defined point mutations by the CRISPR/Cas9 system. Sci. Rep. 4:5396 (2014)). However, the ability to accomplish this would depend on the availability of a PAM site near the altered locus. To test this, we designed a set of 12 gRNAs flanking the 6-base insertion site in EmGFP (FIG. 9A). The gRNAs were enzymatically synthesized using the GENEART™ Precision gRNA Synthesis Kit and then complexed with PLATINUM™ Cas9 nuclease protein. The resulting Cas9 RNPs were delivered to cells by electroporation and the genome cleavage efficiencies were determined at 48 hours post transfection. As shown in FIG. 9B, some gRNAs were more active than others and no distinct pattern was observed. To evaluate the effect of distance between the DSB and the altered locus on HDR, we delivered Cas9 RNP and ssDNA or dsDNA donor into cells sequentially and then determined the percentage of EmGFP-positive cells using flow cytometry. As depicted in FIG. 9C, the gRNAs (−3, +3, and +5) in close proximity to the insertion site produced the highest percentages of EmGFP-positive cells. Although the +3 gRNA was closer to the insertion site than the +5 gRNA, the +3 gRNA exhibited lower HDR efficiency than the +5 gRNA, likely because the genome cleavage efficiency of the +3 gRNA was two-fold lower than that of the +5 gRNA (FIG. 9B and FIG. 9C). Under optimal conditions, we observed more than 30% EmGFP-positive cells using the −3 or +5 gRNAs, which represented a more than 200-fold increase in knock-in efficiency over the donor-only negative control.
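The dependence of gRNA choice on nearby PAM sites can be illustrated with a small search routine. The sketch below scans both strands of a sequence for 20-base targets followed by an NGG PAM and reports the expected break position, assuming the commonly described SpCas9 cut three bases 5′ of the PAM. The names `CutSite`, `find_cut_sites`, and `nearest_site` are hypothetical, and the position labels used in this disclosure (for example −3 or +5) are not necessarily computed this way.

```python
import re
from typing import List, NamedTuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Zero-width lookahead so that overlapping candidate sites are all reported.
_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


class CutSite(NamedTuple):
    strand: str       # "+" if the NGG PAM lies on the given strand, else "-"
    cut: int          # break falls between seq[cut - 1] and seq[cut]
    protospacer: str  # 20-base target, 5'->3' on its own strand
    pam: str


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-cased DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_cut_sites(seq: str) -> List[CutSite]:
    """List candidate NGG-PAM sites on both strands with break coordinates."""
    seq = seq.upper()
    n = len(seq)
    sites = []
    for m in _SITE.finditer(seq):
        # Assumed cut: 3 bases upstream of the PAM, i.e. after target base 17.
        sites.append(CutSite("+", m.start() + 17, m.group(1), m.group(2)))
    for m in _SITE.finditer(revcomp(seq)):
        # Convert the reverse-strand coordinate back to the given strand.
        sites.append(CutSite("-", n - (m.start() + 17), m.group(1), m.group(2)))
    return sites


def nearest_site(seq: str, edit_pos: int) -> CutSite:
    """Pick the candidate whose break is closest to the intended edit."""
    sites = find_cut_sites(seq)
    if not sites:
        raise ValueError("no NGG PAM site found")
    return min(sites, key=lambda s: abs(s.cut - edit_pos))
```

Proximity alone is not sufficient in practice, since the measured cleavage efficiency of each gRNA (Tables 16A-16B) also affects the HDR outcome.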

TABLE 16A Various genome cleavage efficiency with different gRNAs (FIG. 9B data).

                −34     −31     −20     −19     −14     −7      −3      (+)3    (+)5
Avg (% Indel)   69.00   66.00   66.00   35.00   39.50   67.50   71.00   30.00   69.50
Std             1.41    1.41    1.41    2.83    2.12    2.12    1.41    1.41    2.12

TABLE 16B Various genome cleavage efficiency with different gRNAs (FIG. 9B data).

                (+)8    (+)21   (+)30
Avg (% Indel)   75.00   13.50   49.00
Std             1.41    2.12    2.83

TABLE 17A Double Strand Breaks in the immediate vicinity of the altered locus facilitate HDR (FIG. 9C data).

                              Neg    (+)gRNA  (−)gRNA  −34    −31    −20    −19
dsDNA   Avg (% GFP+ cells)    0.08   0.23     0.10     3.46   1.86   2.15   1.42
        Std                   0.01   0.04     0.01     0.04   0.02   0.07   0.28
ssDNA   Avg (% GFP+ cells)    0.08   0.23     0.10     1.78   0.88   1.45   1.86
        Std                   0.01   0.04     0.01     0.06   0.06   0.21   0.13

TABLE 17B Double Strand Breaks in the immediate vicinity of the altered locus facilitate HDR (FIG. 9C data).

                              −14    −7      −3      (+)3    (+)5    (+)8   (+)21  (+)30
dsDNA   Avg (% GFP+ cells)    1.33   7.75    14.35   4.78    13.55   3.36   1.58   1.59
        Std                   0.14   0.07    0.21    0.49    1.06    0.62   0.25   0.49
ssDNA   Avg (% GFP+ cells)    2.45   19.90   29.80   10.25   29.75   8.94   2.06   0.66
        Std                   0.16   0.57    0.99    0.49    0.35    0.45   0.18   0.18

Asymmetric PAM and Non-PAM ssDNA Donors Facilitate HDR

A recent report showed that an asymmetric ssDNA donor, complementary to the target strand with 36 bases on the PAM-distal side and a 91-base extension on the PAM-proximal side of the break, enhanced HDR efficiency (Richardson et al., Nat. Biotechnol. 34:339-344 (2016)). It was proposed that when Cas9 cleaved the target loci, the 3′ end of the PAM-distal strand could dissociate from the RNP/DNA complex and initiate HDR by annealing to a donor complementary to this exposed sequence, suggesting that a donor designed in this manner would be preferred. However, we observed only a slight difference in HDR efficiency between the symmetric PAM (corresponding to the non-target strand in Richardson et al., Nat. Biotechnol. 34:339-344 (2016)) and non-PAM (corresponding to the target strand in Richardson et al., Nat. Biotechnol. 34:339-344 (2016)) strands (FIG. 7B). To further understand the mechanism of ssDNA donor-mediated HDR, we designed gRNAs to introduce DSBs both upstream (−3 gRNA) and downstream (+5 gRNA) of the insertion site (FIG. 10A). Furthermore, we designed a set of asymmetric PAM strand and non-PAM strand ssDNA donors with 30 bases on one homology arm and 51 bases or 67 bases on the other homology arm (FIG. 10B and FIG. 14). The PAM strand was defined as the strand containing the NGG PAM sequence. The symmetric ssDNA donors served as controls. The percentages of GFP+ cells determined by flow cytometry were plotted separately for each individual gRNA (FIG. 14 and see Table 18). For clarity, only a subset of the asymmetric ssDNA donors is shown in FIG. 10B and FIG. 10C, with the percentage of GFP+ cells normalized to the percentage cleavage efficiencies (FIG. 9B).

TABLE 18 Both asymmetric PAM and non-PAM ssDNA donors facilitate HDR (Abbreviations: NP = non-PAM, P = PAM) (FIG. 14 data).

                                  30-51         40-40         50-30         30-67         48-50         67-30
                                  P      NP     P      NP     P      NP     P      NP     P      NP     P      NP
(+3) gRNA  Avg (% GFP+ cells)     4.0    11.3   6.4    6.7    8.7    4.1    6.1    11.4   8.5    9.5    13.9   3.2
(+3) gRNA  Std                    1.0    0.5    0.5    0.4    0.9    0.4    0.6    0.4    0.2    0.7    0.6    0.7
(−3) gRNA  Avg (% GFP+ cells)     34.7   23.7   25.6   14.0   24.6   30.0   38.2   14.1   32.3   18.3   12.4   32.5
(−3) gRNA  Std                    2.6    1.8    1.0    3.2    1.0    3.1    1.7    2.2    0.7    3.0    1.9    2.9
(+5) gRNA  Avg (% GFP+ cells)     30.6   20.6   23.9   23.6   19.7   30.8   32.9   20.2   29.9   26.0   10.5   38.0
(+5) gRNA  Std                    1.6    7.6    3.2    5.5    3.1    3.9    2.2    3.1    2.4    3.9    2.3    1.2

When either the −3 or the +5 gRNA was used to generate a DSB with its PAM site located upstream or downstream (respectively) of the insertion site (FIG. 10A), the asymmetric PAM strand ssDNA donor 67-30 and non-PAM strand donor 67-30 (FIG. 10A and FIG. 10B, respectively) yielded the highest HDR efficiency (shown in FIG. 10C). This suggests that a 30-base 3′ homology arm is favored over longer arms of 67 bases on the 3′ end. This fits the model described in FIG. 10A where the resected DSB allows access to a 3′ overhang for annealing. Supporting this notion are the data obtained with both the PAM strand 30-67 and the non-PAM 30-67 donors, where editing efficiency was significantly lower (FIG. 10C). The data from the −3 gRNA agree with the +5 results, showing that placement of the insertion/SNP relative to the DSB has a small but measurable effect on efficiency. Here it is suggested that when the edited area is either upstream or downstream of the cut, HDR with a donor that anneals to the template side containing the edit site is slightly inhibited, potentially due to the need to overcome the mismatch between the original sequence and the donor sequence (see Table 21). Finally, the symmetrical donors shown in FIGS. 10A-10C (and in FIGS. 14A-14B) show an intermediate efficiency, suggesting that the optimal 3′ homology arm length in this model system could be near 30 bases but likely not as much as 50 bases.

TABLE 19 Both asymmetric PAM (P) and non-PAM (NP) ssDNA donors facilitate HDR (FIG. 10 data).

                                 30-67         48-50         67-30
                                 P      NP     P      NP     P      NP
(−3)gRNA  Avg (% GFP+ cells)     54.3   20.0   46.3   26.1   17.7   46.5
(−3)gRNA  Std                    2.0    2.0    2.3    1.5    2.8    2.1
(+5)gRNA  Avg (% GFP+ cells)     48.5   29.6   43.9   38.2   15.4   55.9
(+5)gRNA  Std                    2.0    5.1    3.4    3.1    2.2    2.1
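The normalization of GFP+ percentages to cleavage efficiency can be sketched as a simple ratio. The example below divides raw values from Table 18 by the indel percentages in Table 16A. This is an illustrative reading only: the exact denominator used to generate Table 19 is not stated, so the results approximate rather than reproduce the tabulated values, and the names used are hypothetical.

```python
# Average % indel for the two gRNAs, transcribed from Table 16A.
CLEAVAGE_PCT = {"-3": 71.0, "+5": 69.5}

# Raw average % GFP+ cells, transcribed from Table 18 (30-67 and 67-30 donors).
RAW_GFP_PCT = {
    "-3": {"30-67 P": 38.2, "30-67 NP": 14.1, "67-30 P": 12.4, "67-30 NP": 32.5},
    "+5": {"30-67 P": 32.9, "30-67 NP": 20.2, "67-30 P": 10.5, "67-30 NP": 38.0},
}


def normalize(gfp_pct: float, indel_pct: float) -> float:
    """Express % GFP+ cells relative to the fraction of cleaved alleles."""
    if indel_pct <= 0:
        raise ValueError("cleavage efficiency must be positive")
    return 100.0 * gfp_pct / indel_pct


if __name__ == "__main__":
    for grna, donors in RAW_GFP_PCT.items():
        for donor, pct in donors.items():
            value = normalize(pct, CLEAVAGE_PCT[grna])
            print(f"{grna} gRNA, {donor}: {value:.1f}")
```

For the −3 gRNA with the 30-67 PAM donor this gives about 53.8, close to the 54.3 listed in Table 19; the small differences suggest the tabulated values were normalized with slightly different cleavage measurements.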

The results were further validated using the same reporter system as described in Richardson et al., Nat. Biotechnol. 34:339-344 (2016), in which a gRNA targeted the eBFP gene. The asymmetric donors with a short 35-base arm on the 3′ end that could anneal to the resected 3′ end of the genomic DSB performed better, with the PAM 65-35 and non-PAM 65-35 donors resulting in approximately 52% and 48% HDR efficiency, respectively (see Table 20), whereas the asymmetric donors with a long 65-base arm on the 3′ end were less effective, with the PAM 35-65 and non-PAM 35-65 donors resulting in 32% and 21% GFP+ cells, respectively. In addition, similar results were seen when using a Cas9 mRNA and asymmetric gRNAs.

TABLE 20 Both asymmetric PAM (P) and non-PAM (NP) ssDNA donors facilitate HDR.

                     40-40         35-65   50-50   65-35         50-50   35-65
                     P      NP     P       P       P      NP     NP      NP
Avg (% GFP+ cells)   44.6   42.2   32.6    41.5    51.7   47.4   28.5    21.2
Std                  0.5    0.5    1.1     0.7     1.0    1.2    2.4     1.6

TABLE 21 Insertion of Flag tag.

                   32-5′                  32-3′                  ssDNA
                   wt     NHEJ   HDR      wt     NHEJ   HDR      wt     NHEJ   HDR
Relative %   Avg   22.0   78.0   0.0      11.1   54.6   34.2     18.2   63     18.8
             Std   2.0    2.0    0.0      2.9    2.0    3.0      4.0    4      4.5

Overall, the use of either the asymmetric PAM strand or non-PAM strand ssDNA donor, which harbors approximately 65-67 bases of homology on the 5′ end and 30-35 bases of homology on the 3′ end, resulted in the highest efficiency of HDR regardless of which genomic strand contained the PAM or whether the DSB was upstream or downstream of the edit site, suggesting a common intermediate for HDR. Contrary to the proposed model of Richardson et al., Nat. Biotechnol. 34:339-344 (2016), we saw no bias in donor design favoring the genomic strand that is proposed to be released by the Cas9 complex.
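The asymmetric design summarized above (a longer 5′ homology arm and a shorter 3′ homology arm, on either strand) can be expressed as a small helper. This is a minimal sketch: the function name, the "top"/"bottom" strand labels, and the default arm lengths of 67 and 30 bases are illustrative choices, and chemical modifications are not modeled.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Reverse complement, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def asymmetric_donor(reference: str, pos: int, insert: str,
                     arm5: int = 67, arm3: int = 30,
                     strand: str = "top") -> str:
    """Return an ssDNA donor, written 5'->3', carrying `insert` at `pos`.

    arm5 and arm3 are the homology arm lengths at the donor's own 5' and 3'
    ends. `reference` is the top strand; for a bottom-strand donor the arms
    are swapped on the top strand before reverse-complementing.
    """
    if strand not in ("top", "bottom"):
        raise ValueError("strand must be 'top' or 'bottom'")
    left, right = (arm5, arm3) if strand == "top" else (arm3, arm5)
    if pos - left < 0 or pos + right > len(reference):
        raise ValueError("reference too short for the requested arms")
    top = reference[pos - left:pos] + insert + reference[pos:pos + right]
    return top if strand == "top" else revcomp(top)


if __name__ == "__main__":
    toy_reference = "ACGTTGCA" * 25  # hypothetical 200-base reference
    print(asymmetric_donor(toy_reference, 100, "GAATTC"))
    print(asymmetric_donor(toy_reference, 100, "GAATTC", strand="bottom"))
```

Writing both strands through the same function keeps the 30-base arm at the donor's own 3′ end in each case, which is the feature the data above associate with higher HDR.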

Short Double-stranded DNA Donor with Single-stranded Overhangs Facilitates Highly Efficient HDR

The work with asymmetric ssDNA donors described above suggested that only about 30 bases at the 3′ end were needed for sufficient single-stranded DNA annealing. To extend this concept, we hypothesized that a dsDNA donor harboring single-stranded overhangs would facilitate HDR to higher levels than one with blunt ends. To test this hypothesis, we designed and generated a series of donor molecules with either blunt ends, 5′ end protrusions, or 3′ end protrusions by annealing two small single-stranded oligonucleotides. A single-stranded DNA donor was used as a control. The 5′ and 3′ ends of the oligonucleotides were protected with two consecutive phosphorothioate-modified bases (Table 4). For proof of concept, we inserted a 30 nucleotide FLAG epitope tag along with an EcoRI site into the BFP gene stably expressed in HEK293 cells. The gRNA was designed to target the top DNA strand. The length of the single-stranded overhangs varied from 6 nucleotides to 30 nucleotides. The oligonucleotides were denatured and re-annealed prior to transfection, forming the structures described in FIG. 11A. The Cas9 RNP and donor DNA were delivered sequentially to HEK293 cells by electroporation. Co-delivery of Cas9 RNPs and donor DNA gave lower HDR efficiencies. At 48 hours post transfection, the genomic locus was PCR-amplified and the editing efficiencies were determined using a GCD assay. As shown in FIG. 11A, approximately 75% cleavage efficiencies were observed with various donor configurations. When the PCR fragments were subjected to restriction digestion with EcoRI to identify properly inserted constructs, only the donor DNA molecules containing 30-base single-stranded overhangs at the 3′ ends produced the expected digested fragments (30-3′ in FIG. 11A). The double-stranded donor with 30-base 3′ overhangs was inserted with efficiencies above 35%, while the ssDNA donor was inserted with approximately 20% efficiency. Upon close examination of the length of the single-stranded overhang, we found that the donor DNA molecules with 24-base 3′-protruded ends produced approximately 15% digestion efficiency, while the seemingly optimal length of the single-stranded overhangs was 30 to 36 nucleotides, with a digestion efficiency of approximately 40%. The use of a 45-base single-stranded overhang decreased the efficiency slightly (FIG. 11B). Comparing two donor DNA molecules containing 30-base single-stranded overhangs at the 3′ ends with and without phosphorothioate modification, the donor DNA with phosphorothioate modification at both 5′ and 3′ ends exhibited approximately 42% digestion efficiency (30-3′), whereas the donor DNA without phosphorothioate modification (30-3′n) lowered the efficiency to around 27% (FIG. 11B). Furthermore, we titrated the amount of ssDNA donor and dsDNA donor with 30-base 3′ end protrusion. As depicted in FIG. 11C, the optimal concentration of DNA donors in the transfection reaction was approximately 1 μM. Under these conditions, we measured greater than 40% digestion efficiency with the 3′-protruded dsDNA donor, which was nearly 10% higher than that using ssDNA. When we performed sequencing analysis of 192 clones, 13% of the clones were wild type and 48.8% of the clones contained indels, suggesting that in these clones the cleavage was repaired by NHEJ. Although 39% of the clones contained the insert, among them about 4% of the clones harbored a point mutation, most likely due to an error in the synthetic DNA oligonucleotide (FIG. 11D, white rectangle). Excluding all the errors and wild type clones, 34% of the clones harbored the correct insertion.
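The construction of a short dsDNA donor with 3′ single-stranded overhangs from two annealed oligonucleotides can be sketched as follows. This is an illustration under stated assumptions: the function name and toy sequences are hypothetical, the two strands pair only across the insert, and the terminal phosphate and phosphorothioate modifications listed in Table 4 are not modeled.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Reverse complement, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


def overhang_donor_pair(reference: str, pos: int, insert: str,
                        overhang: int = 30):
    """Return (top, bottom) oligonucleotides, each written 5'->3'.

    After annealing, the insert forms the duplex region. The top strand
    extends 3' into the sequence downstream of `pos`; the bottom strand
    extends 3' into the complement of the sequence upstream of `pos`.
    """
    if pos - overhang < 0 or pos + overhang > len(reference):
        raise ValueError("reference too short for the requested overhangs")
    top = insert + reference[pos:pos + overhang]
    bottom = revcomp(reference[pos - overhang:pos] + insert)
    return top, bottom


if __name__ == "__main__":
    toy_reference = "ACGTTGCA" * 25  # hypothetical 200-base reference
    top, bottom = overhang_donor_pair(toy_reference, 100, "GAATTC", 30)
    print(top)
    print(bottom)
```

Swapping which flank is attached to which strand would give the corresponding 5′-protruding design, which performed poorly in the experiments described above.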

We also analyzed the edited locus where the ssDNA oligonucleotide served as donor. In this case, approximately 9% of the clones were wild type, 61% of the clones were NHEJ, and 30% of the clones were HDR (FIG. 11D). Among the 30% HDR, 9% of the clones harbored the insertion but with a single base mutation. After excluding the errors, 21% of the clones harbored the correct insertion.

TABLE 22A Short double-stranded DNA donor with single-stranded overhangs facilitates HDR (FIG. 11A data).

                    Neg    (+)gRNA  (−)gRNA  blunt   6-5′    6-3′    15-5′   15-3′
Avg (% Indels)      0.00   77.00    0.00     68.50   56.50   78.50   69.50   81.00
Std                 0.00   2.83     0.00     2.12    4.95    0.71    0.71    1.41
Avg (% digestion)   0.00   0.00     0.00     0.00    0.00    0.00    0.00    0.50
Std                 0.00   0.00     0.00     0.00    0.00    0.00    0.00    0.00

TABLE 22B Short double-stranded DNA donor with single-stranded overhangs facilitates HDR (FIG. 11A).

                    30-5′   30-3′   ssDNA
Avg (% Indels)      80.00   81.00   77.50
Std                 0.00    4.24    3.54
Avg (% digestion)   1.00    37.50   23.50
Std                 0.00    4.95    3.54

TABLE 23 Effect of length of 3′ overhang on percentage of digestion (FIG. 11B).

                    15-3′   24-3′   30-3′n  30-3′   36-3′   45-3′
Avg (% digestion)   0.8     17.3    31.5    43.1    41.7    37.0
Std                 0.4     1.1     0.7     1.6     2.3     4.2

TABLE 24 Dose effect of ssDNA donor and 3′ overhanged short dsDNA donor (FIG. 11C).

              30-3′                       ssDNA
[DNA] (μM)    Avg (% digestion)   Std     Avg (% digestion)   Std
3.00          28.50               2.12    7.60                1.98
2.00          39.00               2.83
1.65                                      22.00               4.24
1.00          41.50               2.12    27.00               4.24
0.66                                      24.50               2.12
0.50          37.00               0.00    20.00               1.41

TABLE 25 Sequencing analysis (FIG. 11D).

                             Relative percentage   HDR with point mutation   Total
3′ overhanged donor   wt     12.7                  0                         12.7
                      NHEJ   48.8                  0                         48.8
                      HDR    34.1                  4.4                       38.5
ssDNA donor           wt     9                     0                         9
                      NHEJ   61.2                  0                         61.2
                      HDR    20.5                  9.3                       29.8
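The relationship between total HDR, HDR with a point mutation, and error-free HDR in Table 25 can be checked with simple arithmetic. The sketch below uses hypothetical names and computes the error-free percentage together with the share of HDR events that were error-free, a derived figure that is not reported in the table itself.

```python
# Percentages of sequenced clones, transcribed from Table 25.
TABLE_25 = {
    "3' overhanged dsDNA donor": {"hdr_total": 38.5, "hdr_point_mutation": 4.4},
    "ssDNA donor": {"hdr_total": 29.8, "hdr_point_mutation": 9.3},
}


def summarize(row: dict) -> tuple:
    """Return (error-free HDR %, error-free share of all HDR events %)."""
    correct = row["hdr_total"] - row["hdr_point_mutation"]
    share = 100.0 * correct / row["hdr_total"]
    return round(correct, 1), round(share, 1)


if __name__ == "__main__":
    for donor, row in TABLE_25.items():
        correct, share = summarize(row)
        print(f"{donor}: {correct}% correct insertion, "
              f"{share}% of HDR events error-free")
```

On these figures, about 89% of HDR events with the 3′ overhanged donor were error-free, compared with about 69% for the ssDNA donor.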

In order to understand the polarity of the dsDNA donor with single-stranded overhangs, we inserted a FLAG epitope tag along with an EcoRI site into a separate locus where the +5 gRNA was targeting the bottom strand of the EmGFP gene (see Table 21). The Cas9 RNPs were first delivered into GRIPTITE™ HEK293 cells, followed by the delivery of ssDNA donor or short dsDNA donors with 32-base single-stranded overhangs at either the 5′ or 3′ end. Samples in the absence of gRNA served as controls. At 48 hours post transfection, the genomic loci were amplified by PCR. The resulting PCR fragments were subjected to restriction digestion with EcoRI (data not shown) or subjected to sequencing analysis. The integration efficiency of the FLAG epitope tag along with an EcoRI site was approximately 34% using the dsDNA donor with 3′ overhangs and 20% using the ssDNA donor. The dsDNA donor with 5′ overhangs resulted in a barely detectable integration product. The results indicated that the polarity of the single-stranded overhangs remained the same regardless of how the DSBs were introduced by Cas9 RNPs.

Discussion

We have demonstrated that mammalian cells are fully capable of carrying out homology directed end repair efficiently without exogenous inhibition of the non-homologous end-joining pathway. The design and delivery of gRNAs, Cas9 nuclease, and donor molecules are critical to achieve high HDR efficiencies. Ideally, in order to achieve high editing efficiency, the double-stranded break induced by Cas9 nuclease should be in close proximity to the edit site, as just a few additional bases further up- or downstream can make a significant difference in editing efficiency. One limitation of the CRISPR system for precise editing is exposed here, since the location of a potential DSB site, and consequently the efficiency of donor insertion into the genome, is dictated by the availability of PAM sites relatively near the intended edit. Further, even if a gRNA target site happens to be in the immediate vicinity of the edit locus, it is not guaranteed to have high modification efficiency because the gRNA activity may depend on the nature of the gRNA sequence, chemical modification, as well as its accessibility to the genomic locus. Finally, the chance of off-target cutting for each gRNA must be considered. In this regard, alternate tools such as TALENs mutated to lack the 5′ T targeting requirement, or, more recently and potentially, N. gregoryi Argonaute (REF), have an inherent advantage over CRISPR in that they can be programmed to target virtually anywhere in the genome with no PAM restrictions.
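
The dependence of cut-site placement on PAM availability can be illustrated with a short sketch. It assumes, purely for illustration, an SpCas9-type nuclease with an NGG PAM, a 20-base protospacer, and a blunt cut 3 bp on the protospacer side of the PAM; the function name `find_cut_sites` and the default 10-nucleotide window are hypothetical choices and not part of the disclosure.

```python
import re


def find_cut_sites(seq, edit_pos, max_dist=10):
    """List predicted cut sites within max_dist bases of edit_pos.

    Assumptions (illustrative only): NGG PAM, 20-base protospacer,
    blunt cut 3 bp from the PAM. A cut index c means the break falls
    between seq[c - 1] and seq[c] in top-strand coordinates.
    Returns (cut_index, strand, distance) tuples, nearest first.
    """
    seq = seq.upper()
    hits = []
    # gRNA matching the top strand: PAM occupies seq[p:p + 3].
    for m in re.finditer(r"(?=[ACGT]GG)", seq):
        p = m.start()
        if p >= 20:  # room for a full protospacer upstream of the PAM
            cut = p - 3
            hits.append((cut, "+", abs(cut - edit_pos)))
    # gRNA matching the bottom strand: PAM reads as CCN on the top strand.
    for m in re.finditer(r"(?=CC[ACGT])", seq):
        q = m.start()
        if q + 23 <= len(seq):  # room for a full protospacer downstream
            cut = q + 6
            hits.append((cut, "-", abs(cut - edit_pos)))
    near = [h for h in hits if h[2] <= max_dist]
    return sorted(near, key=lambda h: h[2])


if __name__ == "__main__":
    demo = "A" * 25 + "TGG" + "A" * 25
    print(find_cut_sites(demo, edit_pos=20))
```

An empty result for a given edit position corresponds to the limitation described above: no PAM places a break close enough to the intended edit.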

If the Cas9 RNPs are efficiently delivered to cells for induction of double-stranded breaks and the donor molecules are readily available at the time of DNA repair, the HDR pathway can be nearly as efficient as the NHEJ pathway. The HDR frequencies depend on the dose of donor DNA molecules, with the optimal delivery concentration being approximately 1 μM. The optimal length of ssDNA donor is approximately 70 to 100 nucleotides, having a 35-50 base homology arm on either side of the edit sequence. The protection of donor DNA with phosphorothioate modification improves HDR efficiency in our model system. The delivery conditions for Cas9 RNPs and donors are also crucial, as we observe that sequential delivery of Cas9 RNPs and donor DNA facilitates HDR. This may be due to the Cas9 protein having non-specific DNA binding activity, leading to decreased transfection efficiency when paired with donor. However, sequential delivery is not applicable to cells that are sensitive to multiple rounds of electroporation, such as iPSCs. In iPSCs, the co-delivery of Cas9 RNPs and ssDNA donor produced up to 24% HDR efficiency (data not shown). The use of Cas9-expressing cells can be beneficial for genome editing because the delivery of Cas9 nuclease is not necessary, resulting in increased transfection efficiency of gRNA and/or donor DNA. For example, we observed precise genome editing rates of up to 40% in Cas9-expressing iPSCs for a single nucleotide substitution at multiple genomic loci (data not shown). However, extra effort is required to generate the stable cell lines expressing Cas9 nuclease, with the added risk of a higher off-target effect.
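
The ssDNA donor geometry described above (a homology arm of 35-50 bases on either side of the edit sequence) can be sketched as follows. This is a minimal illustration under stated assumptions: the edit is an insertion placed exactly at the cut site, both arms have the same length, and the function name `design_ssdna_donor` and its parameters are hypothetical.

```python
def design_ssdna_donor(genome, cut_index, insert="", arm_len=40):
    """Return a top-strand ssDNA donor: left arm + insert + right arm.

    Assumptions (illustrative only): the insert sits exactly at the
    cut site and both arms are arm_len bases long. The 35-50 base arm
    range follows the text above.
    """
    if not 35 <= arm_len <= 50:
        raise ValueError("arm_len should be 35-50 bases")
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the arms")
    genome = genome.upper()
    left_arm = genome[cut_index - arm_len:cut_index]
    right_arm = genome[cut_index:cut_index + arm_len]
    return left_arm + insert.upper() + right_arm


if __name__ == "__main__":
    locus = "ACGT" * 30  # placeholder sequence, not a real locus
    donor = design_ssdna_donor(locus, cut_index=60, insert="GAATTC")
    print(len(donor), donor)
```

With 40-base arms and a 6-base insert the donor is 86 nucleotides long, which falls inside the approximately 70 to 100 nucleotide range given above.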

The donor design and configuration also contribute to the editing efficiency. A recent report showed that asymmetric design of ssDNA donors promoted HDR by overlapping the Cas9 cut site with 36 bases on the PAM-distal side and with a 91-base extension on the PAM-proximal side of the break. A donor DNA complementary to the non-target strand stimulated HDR frequencies up to 2.6-fold greater than those obtained with a donor DNA complementary to the target strand (Richardson et al., Nat. Biotechnol. 34:339-344 (2016)). However, we observe that both the asymmetric PAM strand (corresponding to the non-target strand in Richardson et al., 2016) and non-PAM strand (corresponding to the target strand in Richardson et al., 2016) enhance HDR regardless of the orientation of the Cas9 nuclease. Thus, we propose that, after Cas9 nuclease cleaves, both sides of the double-stranded break are recognized equally by the DNA repair machinery. In this model, a repertoire of cellular proteins involved in DNA repair is recruited to the broken ends to rectify the damaged DNA via either the NHEJ pathway (FIG. 12A) or the HDR pathway (FIG. 12B-FIG. 12D). In order for HDR-mediated donor insertion to occur, cellular exonucleases excise the ends from 5′ to 3′, thereby generating 3′ overhangs on either side of the break (Nimonkar, A. V., Ozsoy, A. Z., Genschel, J., Modrich, P., Kowalczykowski, S. C., Human exonuclease 1 and BLM helicase interact to resect DNA and initiate DNA repair, PNAS 105(44):16906-16911 (2008)), which can anneal to the 3′ end of either a PAM or non-PAM ssDNA donor. In either case, one of the 3′ recessed ends of the break will anneal with the ssDNA donor and then be extended by DNA polymerase with the 5′ end of the ssDNA donor serving as template (FIG. 12B). However, it is unclear how the other 3′ recessed end of the break bridges the newly extended dsDNA to repair the lesion (Kan, Y., Ruis, B., Hendrickson, E. A., The Mechanism of Gene Targeting in Human Somatic Cells, PLOS Genetics 10(4):e1004251 (2014)).
It appears that 30-36 nucleotides are sufficient for single-stranded DNA annealing at each end. The polarity and spacing of 3′ recessed ends are further confirmed by the use of short dsDNA donors with single-stranded overhangs. By annealing two single-stranded DNA oligonucleotides, we create different configurations of donor DNA molecules. Interestingly, short dsDNA molecules, ideally with 30 to 36-base single-stranded overhangs at the 3′ ends, appear to be used efficiently in the HDR pathway regardless of whether the gRNA targets the top or bottom strand. These results support a model whereby, after the DSB is made, a common “3′ recessed ends” intermediate is formed by the HDR machinery and can be used with many complementary donor DNA molecules containing matching 3′ homology arms. This model and associated data favor the design of donor DNAs with the insertion or SNP repair element as close to the cleavage site as possible and 3′ protruding single-strand homology arms of approximately 30-36 bases for larger donor molecules. For smaller single-stranded donor molecules, 30-35 base arms 3′ to the insertion/repair cassette and greater than 40 bases on the 5′ end seem to be favored.
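
One way to picture a short dsDNA donor with 3′ single-stranded overhangs is as two oligonucleotides that anneal over the insert and leave the homology arms protruding at their 3′ ends. The sketch below is an assumed geometry for illustration only: the double-stranded core is taken to be the insert itself, the insert is placed at the cut site, and the names `revcomp` and `design_3prime_overhang_donor` are hypothetical. The oligonucleotide layouts actually used in the Examples may differ.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_3prime_overhang_donor(genome, cut_index, insert, overhang_len=30):
    """Return (top_oligo, bottom_oligo), both written 5' to 3'.

    Assumed geometry (illustrative only): the two oligos anneal over
    the insert, and each carries a 3' single-stranded homology arm of
    overhang_len bases matching one side of the break.
    """
    if not insert:
        raise ValueError("a non-empty insert forms the double-stranded core")
    if cut_index - overhang_len < 0 or cut_index + overhang_len > len(genome):
        raise ValueError("not enough flanking sequence for the overhangs")
    genome = genome.upper()
    insert = insert.upper()
    left_arm = genome[cut_index - overhang_len:cut_index]
    right_arm = genome[cut_index:cut_index + overhang_len]
    top = insert + right_arm               # 3' overhang: right arm
    bottom = revcomp(left_arm + insert)    # 3' overhang: left arm complement
    return top, bottom


if __name__ == "__main__":
    locus = "ACGT" * 30  # placeholder sequence, not a real locus
    top, bottom = design_3prime_overhang_donor(locus, 60, "GAATTC", 30)
    print(top)
    print(bottom)
```

In this layout the 3′ tail of the bottom oligonucleotide is complementary to the top-strand sequence immediately upstream of the cut, and the 3′ tail of the top oligonucleotide matches the top-strand sequence immediately downstream, which corresponds to the matching 3′ homology arms in the model above.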

While the foregoing embodiments have been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the embodiments disclosed herein. For example, all the techniques, apparatuses, systems and methods described above can be used in various combinations.

1. A method for performing homologous recombination, the method comprising: (a) generating a double-stranded break in a nucleic acid molecule present inside a cell to produce a cleaved nucleic acid molecule, and (b) contacting the cleaved nucleic acid molecule generated in (a) with a donor nucleic acid molecule, wherein the cleaved nucleic acid molecule and the donor nucleic acid molecule each contain matched termini on at least one end, wherein the matched termini on at least one end of the cleaved nucleic acid molecule and the donor nucleic acid molecule is at least ten nucleotides in length, and wherein the matched region of the cleaved nucleic acid molecule is single-stranded or double-stranded and the matched region of the donor nucleic acid molecule is single-stranded.
2. The method of claim 1, wherein the matched termini on at least one end of the cleaved nucleic acid molecule and the donor nucleic acid molecule have 5′ overhangs or 3′ overhangs.
3. The method of claim 1, wherein the matched termini on at least one end of the cleaved nucleic acid molecule and the donor nucleic acid molecule have one 5′ overhang and one 3′ overhang.
4.-7. (canceled)
8. The method of claim 1, wherein the cleaved nucleic acid molecule has at least one terminus with a single-stranded region.
9. The method of claim 1, wherein the double-stranded break in the nucleic acid molecule present inside the cell is generated by the formation of two nicks, one in each strand of the nucleic acid molecule.
10. The method of claim 9, wherein the cleaved nucleic acid molecule has at least one blunt terminus.
11. (canceled)
12. The method of claim 1, wherein donor nucleic acid molecule contains one or more nuclease resistant groups in at least one strand of at least one terminus.
13. The method of claim 12, wherein donor nucleic acid molecule contains one or more nuclease resistant groups in both strands of both termini.
14.-15. (canceled)
16. The method of claim 1, wherein the donor nucleic acid molecule has asymmetric termini.
17.-30. (canceled)
31. A method for performing homologous recombination in a population of cells, the method comprising, (a) contacting the population of cells with a nucleic acid cutting entity under conditions that allow for the generation of a double-stranded break at a target locus in nucleic acid present inside cells of the population, to produce cells containing an intracellular cleaved nucleic acid molecule, and (b) introducing a donor nucleic acid molecule into cells generated in step (a) under conditions that allow for homologous recombination to occur, wherein homologous recombination occurs at the target locus in at least 20% of the cells of the population, and wherein the target locus and/or the donor nucleic acid molecule have one or more of the following characteristics: (a) the target locus and the donor nucleic acid molecule share at least one matched terminus, (b) the donor nucleic acid molecule contains one or more nuclease resistant group, (c) donor nucleic acid molecule has asymmetric termini, (d) the target locus cut site is within 15 nucleotides of the location where alteration is desired, (e) the nucleic acid cutting entity, or components thereof, and the donor nucleic acid molecule are contacted with the cells of the population at different times, and/or (f) the amount of the donor nucleic acid molecule contacted with cells of the population is in a range that allows for efficient uptake and homologous recombination.
32.-39. (canceled)
40. The method of claim 31, wherein the cells of the population are contacted with the nucleic acid cutting entity, or components thereof, before the cells of the population are contacted with the donor nucleic acid molecule.
41. The method of claim 31, wherein the cells of the population are contacted with the nucleic acid cutting entity, or components thereof, for between 5 and 60 minutes before the cells of the population are contacted with the donor nucleic acid molecule.
42. The method of claim 31, wherein the donor nucleic acid molecule contains one or more nuclease resistant group at one or more terminus.
43.-46. (canceled)
47. The method of claim 31, wherein the target locus cut site is within 10 nucleotides of the location where alteration is desired.
48. The method of claim 31, wherein the target locus cut site comprises a single stranded region that includes all or part of the location where alteration is desired.
49. The method of claim 31, wherein the single-stranded region contains a single mismatched nucleotide between the target locus and the donor nucleic acid molecule.
50. The method of claim 31, wherein the amount of donor nucleic acid is between 50 and 900 ng per 1×10⁵ cells.
51. The method of claim 31, wherein donor nucleic acid molecules are introduced into cells of the population by electroporation or transfection.
52. A method for performing homologous recombination in a cell, the method comprising: (a) introducing into the cell a nucleic acid cutting entity capable of generating a double-stranded break at a specified location in a nucleic acid molecule present inside a cell to produce a cleaved nucleic acid molecule, and (b) introducing a donor nucleic acid molecule into the cell, wherein step (a) is performed before step (b) or wherein step (b) is performed before step (a).
53. The method of claim 52, wherein the introduction of the nucleic acid cutting entity or the donor nucleic acid molecule into the cell is mediated by electroporation.